add readme and results

marcoyang 2024-03-29 17:31:33 +08:00
parent 9e9bc7593e
commit 39e7de47b1
2 changed files with 56 additions and 0 deletions

egs/audioset/AT/README.md Normal file

@ -0,0 +1,12 @@
# Introduction
This is an audio tagging recipe. It aims to predict the sound events present in an audio clip.
[./RESULTS.md](./RESULTS.md) contains the latest results.
# Zipformer
| Encoder | Feature type |
| --------| -------------|
| Zipformer | Frame level fbank|


@ -0,0 +1,44 @@
## Results
### zipformer
See <https://github.com/k2-fsa/icefall/pull/1421> for more details.
[zipformer](./zipformer)
You can find a pretrained model, training logs, decoding logs, and decoding results at:
<https://huggingface.co/marcoyang/icefall-audio-tagging-audioset-zipformer-2024-03-12#/>
The model achieves the following mean average precision (mAP) on AudioSet:
| Model | mAP (%) |
| ------ | ------- |
| Zipformer-AT | 45.1 |
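
For readers unfamiliar with the metric, the following is a minimal sketch of how mean average precision can be computed for multi-label tagging. It is not the recipe's evaluation code. The function names and the toy data are hypothetical, and tied scores are broken by input order, which may differ slightly from library implementations.

```python
def average_precision(labels, scores):
    """Average precision for one class.

    Clips are ranked by descending score; precision is taken at the rank of
    every positive clip and then averaged. Returns None if the class has no
    positive clip.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else None


def mean_average_precision(label_rows, score_rows):
    """Mean of the per-class APs; rows are clips, columns are event classes."""
    num_classes = len(label_rows[0])
    aps = []
    for c in range(num_classes):
        ap = average_precision(
            [row[c] for row in label_rows],
            [row[c] for row in score_rows],
        )
        if ap is not None:  # skip classes without any positive clip
            aps.append(ap)
    return sum(aps) / len(aps)


# Toy example (hypothetical data): 4 clips, 2 event classes.
toy_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_scores = [[0.9, 0.2], [0.1, 0.8], [0.4, 0.3], [0.3, 0.7]]
toy_map = mean_average_precision(toy_labels, toy_scores)
print(f"toy mAP: {toy_map:.4f}")
```

In this toy case the first class is ranked perfectly and the second has one negative clip ranked above a positive one, so the mean falls below 1.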
The training command is:
```bash
# Use four GPUs; --world-size below must match the number of visible devices.
export CUDA_VISIBLE_DEVICES="4,5,6,7"
# AudioSet subset to train on.
subset=full
python zipformer/train.py \
  --world-size 4 \
  --num-epochs 50 \
  --exp-dir zipformer/exp_at_as_${subset} \
  --start-epoch 1 \
  --use-fp16 1 \
  --num-events 527 \
  --audioset-subset $subset \
  --max-duration 1000 \
  --enable-musan True \
  --master-port 13455
```
The evaluation command is:
```bash
python zipformer/evaluate.py \
  --epoch 32 \
  --avg 8 \
  --exp-dir zipformer/exp_at_as_full \
  --max-duration 500
```
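
The `--epoch 32 --avg 8` pair selects which checkpoints are combined before evaluation. The sketch below illustrates the general idea of checkpoint averaging, under the assumption that the eight checkpoints ending at epoch 32 are averaged element-wise. How `zipformer/evaluate.py` actually combines them is not stated here, and `checkpoint_range` and `average_parameters` are hypothetical helpers, not functions from the recipe.

```python
def checkpoint_range(epoch, avg):
    """Epochs whose checkpoints are combined: the `avg` epochs ending at `epoch`."""
    return list(range(epoch - avg + 1, epoch + 1))


def average_parameters(checkpoints):
    """Element-wise mean of parameter dicts mapping name -> list of floats."""
    count = len(checkpoints)
    return {
        name: [
            sum(values) / count
            for values in zip(*(ckpt[name] for ckpt in checkpoints))
        ]
        for name in checkpoints[0]
    }


# Assumption: --epoch 32 --avg 8 covers epochs 25 through 32.
epochs = checkpoint_range(32, 8)
print(epochs)

# Toy parameters standing in for two real checkpoints.
averaged = average_parameters([{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}])
print(averaged)
```

Averaging several late checkpoints usually gives a smoother model than any single one, which is why the evaluated epoch (32) can be earlier than the last training epoch (50).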