update results by stateless6

Guo Liyong 2022-05-28 01:31:52 +08:00
parent b3329f333a
commit 093fbd1234


@@ -5,12 +5,25 @@ train-clean-100 subset as training data.
## Distillation with hubert
### 2022-05-27
Related models, logs, and tensorboard files are uploaded to:
https://huggingface.co/GuoLiyong/stateless6_baseline_vs_disstillation
Decoding method is modified beam search. Epoch is 0-based when doing these experiments.
The following results are obtained by ./distillation_with_hubert.sh
The only difference between the two setups is in pruned_transducer_stateless6/train.py.
For baseline: set enable_distillation=False
For distillation: set enable_distillation=True (the default)
| training setup | test-clean | test-other | comment |
|-------------------------------------|------------|------------|------------------------------------------|
| baseline no vq distillation | 7.08 | 18.66 | --epoch 19, --avg 10, --max-duration 200 |
| distillation with hubert | 5.68 | 15.80 | --epoch 19, --avg 10, --max-duration 200 |
| baseline no vq distillation | 7.09 | 18.88 | --epoch 20, --avg 10, --max-duration 200 |
| baseline no vq distillation | 6.83 | 18.19 | --epoch 30, --avg 10, --max-duration 200 |
| baseline no vq distillation | 6.73 | 17.79 | --epoch 40, --avg 10, --max-duration 200 |
| baseline no vq distillation | 6.75 | 17.68 | --epoch 50, --avg 10, --max-duration 200 |
| distillation with hubert | 5.82 | 15.98 | --epoch 20, --avg 10, --max-duration 200 |
| distillation with hubert | 5.52 | 15.15 | --epoch 30, --avg 10, --max-duration 200 |
| distillation with hubert | 5.45 | 14.94 | --epoch 40, --avg 10, --max-duration 200 |
| distillation with hubert | 5.50 | 14.77 | --epoch 50, --avg 10, --max-duration 200 |
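
To make the gap between the two setups easier to read, the sketch below computes the relative reduction for each checkpoint from epoch 20 to epoch 50. The numbers are copied from the table above and are assumed to be word error rates in percent. The names `baseline`, `distilled`, `relative_reduction`, and `summarize` are illustrative only and are not part of the recipe.

```python
# Values copied from the results table: epoch -> (test-clean, test-other).
# Assumption: the table reports word error rate (WER) in percent.
baseline = {
    20: (7.09, 18.88),
    30: (6.83, 18.19),
    40: (6.73, 17.79),
    50: (6.75, 17.68),
}
distilled = {
    20: (5.82, 15.98),
    30: (5.52, 15.15),
    40: (5.45, 14.94),
    50: (5.50, 14.77),
}


def relative_reduction(base, new):
    """Return the relative reduction in percent: 100 * (base - new) / base."""
    return 100.0 * (base - new) / base


def summarize():
    """Return one (epoch, test-clean reduction, test-other reduction) tuple per epoch."""
    rows = []
    for epoch in sorted(baseline):
        base_clean, base_other = baseline[epoch]
        dist_clean, dist_other = distilled[epoch]
        rows.append(
            (
                epoch,
                relative_reduction(base_clean, dist_clean),
                relative_reduction(base_other, dist_other),
            )
        )
    return rows


if __name__ == "__main__":
    for epoch, clean, other in summarize():
        print(f"epoch {epoch}: test-clean {clean:.1f}%, test-other {other:.1f}%")
```

Under these assumptions, distillation gives a relative reduction of roughly 15% to 19% at every listed checkpoint, with the larger gains on test-clean.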
## Conformer encoder + embedding decoder