mirror of https://github.com/k2-fsa/icefall.git (synced 2025-08-26 18:24:18 +00:00)
added RESULTS.md
parent 78b39d9a7d
commit 60691efddf
@@ -19,7 +19,9 @@ The following table lists the differences among them.
| `transducer_stateless_modified` | Conformer | Embedding + Conv1d | with modified transducer from `optimized_transducer` |
| `transducer_stateless_modified-2` | Conformer | Embedding + Conv1d | with modified transducer from `optimized_transducer` + extra data |
| `pruned_transducer_stateless3` | Conformer (reworked) | Embedding + Conv1d | pruned RNN-T + reworked model with random combiner + using aidatatang_20zh as extra data |
| `pruned_transducer_stateless7` | Zipformer | Embedding | pruned RNN-T + zipformer encoder + stateless decoder with context-size 1 |
| `pruned_transducer_stateless7` | Zipformer | Embedding | pruned RNN-T + zipformer encoder + stateless decoder with context-size set to 1 |
| `zipformer` | Upgraded Zipformer | Embedding + Conv1d | The latest recipe with context-size set to 1 |

The decoder in `transducer_stateless` is modified from the paper
[Rnn-Transducer with Stateless Prediction Network](https://ieeexplore.ieee.org/document/9054419/).
@@ -5,3 +5,15 @@ transcripts, collected from Cantonese audiobooks from Hong Kong. It comprises ph
politics, education, culture, lifestyle and family domains, covering a wide range of topics.

Manuscript can be found at: https://arxiv.org/abs/2201.02419

# Transducers

| | Encoder | Decoder | Comment |
|---------------------------------------|---------------------|--------------------|-----------------------------|
| `zipformer` | Upgraded Zipformer | Embedding + Conv1d | The latest recipe with context-size set to 1 |

The decoder is modified from the paper
[Rnn-Transducer with Stateless Prediction Network](https://ieeexplore.ieee.org/document/9054419/).
We place an additional Conv1d layer right after the input embedding layer.
41
egs/mdcc/ASR/RESULTS.md
Normal file
@@ -0,0 +1,41 @@
## Results

#### Zipformer

See <https://github.com/k2-fsa/icefall/pull/1537>

[./zipformer](./zipformer)

##### normal-scaled model, number of model parameters: 74470867, i.e., 74.47 M

| | test | valid | comment |
|------------------------|------|-------|-----------------------------------------|
| greedy search | 7.45 | 7.51 | --epoch 45 --avg 35 |
| modified beam search | 6.68 | 6.73 | --epoch 45 --avg 35 |
| fast beam search | 7.22 | 7.28 | --epoch 45 --avg 35 |
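
To make the gaps between decoding methods in the table above easier to compare, here is a minimal sketch that computes the relative reduction of each method against greedy search. The table does not name the metric, so the numbers are treated simply as error rates in percent (an assumption); the names `results` and `relative_reduction` are hypothetical and not part of the recipe.

```python
# Error rates copied from the table above. The metric name is not stated
# there, so the values are treated as plain percentages (assumption).
results = {
    "greedy search": {"test": 7.45, "valid": 7.51},
    "modified beam search": {"test": 6.68, "valid": 6.73},
    "fast beam search": {"test": 7.22, "valid": 7.28},
}


def relative_reduction(baseline: float, value: float) -> float:
    """Return how much lower `value` is than `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


baseline = results["greedy search"]
for method, scores in results.items():
    if method == "greedy search":
        continue  # the baseline is not compared with itself
    for split in ("test", "valid"):
        gain = relative_reduction(baseline[split], scores[split])
        print(f"{method:22s} {split:5s} {gain:.1f}% lower than greedy search")
```

On these numbers, modified beam search comes out about 10% lower than greedy search on both splits, and fast beam search about 3% lower.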

The training command:

```
export CUDA_VISIBLE_DEVICES="0,1,2,3"

./zipformer/train.py \
  --world-size 4 \
  --start-epoch 1 \
  --num-epochs 50 \
  --use-fp16 1 \
  --exp-dir ./zipformer/exp \
  --max-duration 1000
```

The decoding command:

```
./zipformer/decode.py \
  --epoch 45 \
  --avg 35 \
  --exp-dir ./zipformer/exp \
  --decoding-method greedy_search # modified_beam_search
```
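
The trailing comment in the decoding command suggests that the method is swapped by hand between runs. As a rough sketch, the command line for each method could be assembled as follows. `greedy_search` and `modified_beam_search` come from the command above; `fast_beam_search` is an assumption inferred from the results table row, and the helper names `BASE_ARGS`, `METHODS`, and `decode_command` are hypothetical.

```python
import shlex
from typing import List

# Flags copied from the decoding command above.
BASE_ARGS = [
    "./zipformer/decode.py",
    "--epoch", "45",
    "--avg", "35",
    "--exp-dir", "./zipformer/exp",
]

# greedy_search and modified_beam_search appear in the command above;
# fast_beam_search is an assumption inferred from the results table.
METHODS = ["greedy_search", "modified_beam_search", "fast_beam_search"]


def decode_command(method: str) -> List[str]:
    """Build the argument list for one decoding method."""
    return BASE_ARGS + ["--decoding-method", method]


for method in METHODS:
    # Only print the command; pass the list to
    # subprocess.run(..., check=True) to actually execute it.
    print(shlex.join(decode_command(method)))
```

Keeping the arguments as a list rather than a single string avoids shell-quoting problems if the experiment directory ever contains spaces.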

The pretrained model is available at: https://huggingface.co/zrjin/icefall-asr-mdcc-zipformer-2024-03-11/