Zengwei Yao
08473a17aa
Modify init ( #301 )
...
* update icefall/__init__.py to import more common functions.
* update icefall/__init__.py
* make imports style consistent.
* exclude black check for icefall/__init__.py in pyproject.toml.
2022-04-10 23:29:28 +08:00
Fangjun Kuang
78b8792d1d
Fix potential bugs in PyTorch that exist in label_smoothing. ( #300 )
2022-04-08 13:41:33 +08:00
Fangjun Kuang
7c0070e6f6
Display torch version in the training log. ( #299 )
2022-04-08 11:39:54 +08:00
Zengwei Yao
ceeb95bcb8
update icefall/__init__.py to import more common functions. ( #294 )
2022-04-06 11:55:29 +08:00
Wei Kang
cb3ba16f2b
Fix aishell prepare.sh when using pre-downloaded data ( #291 )
2022-04-05 10:22:49 +08:00
Fangjun Kuang
87cf9231ea
Support specifying iteration number of checkpoints for decoding. ( #289 )
2022-04-03 13:02:08 +08:00
Zengwei Yao
0b6a2213c3
Modify icefall/__init__.py. ( #287 )
...
* Modify icefall/__init__.py to import common functions defined in icefall/utils.py.
* Modify icefall/__init__.py and .flake8.
2022-04-02 15:01:45 +08:00
Fangjun Kuang
e7493ede90
Don't use a lambda for dataloader's worker_init_fn. ( #284 )
...
* Don't use a lambda for dataloader's worker_init_fn.
2022-03-31 20:32:00 +08:00
Fangjun Kuang
9a11808ed3
Set the seed for dataloader. ( #282 )
...
Also, suppress torch warnings about division by truncation.
2022-03-31 16:48:46 +08:00
LIyong.Guo
fc40bfea82
fix typo of torch.eig ( #281 )
...
Co-authored-by: glynpu <glynwpu@qq.com>
2022-03-31 10:43:46 +08:00
Fangjun Kuang
2045125fd9
Fix CI. ( #280 )
...
* Fix CI.
2022-03-31 10:43:02 +08:00
Fangjun Kuang
981b064007
Update doc to clarify the installation order of dependencies. ( #279 )
2022-03-30 18:50:54 +08:00
Mingshuang Luo
f686635b54
Update diagnostics ( #260 )
...
* update diagnostics.py
2022-03-30 14:52:55 +08:00
Fangjun Kuang
395a3f952b
Batch decoding for models trained with optimized_transducer ( #267 )
...
* Add greedy search in batch mode.
* Add modified beam search in batch mode.
2022-03-23 19:11:34 +08:00
Fangjun Kuang
3ae7265737
More fixes to the checkpoint code. ( #266 )
2022-03-23 14:37:54 +08:00
Fangjun Kuang
6a091da0b0
Minor fixes for saving checkpoints. ( #265 )
...
* Minor fixes for saving checkpoints.
* Fix loading checkpoints saved by previous code.
2022-03-23 12:22:05 +08:00
Fangjun Kuang
8c7995d493
Support modified beam search in batch mode. ( #264 )
...
* Support modified beam search in batch mode.
* Update k2 versions in GitHub CI.
2022-03-22 15:14:04 +08:00
Fangjun Kuang
d5c78a2238
Implement greedy search in batch mode for transducer decoding. ( #262 )
2022-03-22 10:32:22 +08:00
Wei Kang
b2b4d9e0b6
Add fast beam search decoding ( #250 )
...
* Add fast beam search decoding
* Minor fixes
* Minor fixes
* Minor fixes
* Fix comments
* Fix comments
2022-03-21 16:22:25 +08:00
Fangjun Kuang
ae564f91e6
Periodically saving checkpoint after processing given number of batches ( #259 )
...
* Periodically saving checkpoint after processing given number of batches.
2022-03-20 23:51:33 +08:00
Fangjun Kuang
910e6c9306
Minor fixes to tedlium3 to make ./prepare.sh working. ( #258 )
2022-03-20 20:26:03 +08:00
Mingshuang Luo
ad28c8c5eb
Tedlium3 transducer stateless ( #233 )
...
* add tedlium3 transducer-stateless
2022-03-18 11:39:06 +08:00
Mingshuang Luo
518ec6414a
Update diagnostics.py ( #254 )
...
* update diagnostics.py
* do some changes
2022-03-16 20:17:45 +08:00
Fangjun Kuang
a7643301ec
Cache pip packages for GitHub actions ( #253 )
...
* Cache pip packages in GitHub actions.
2022-03-15 15:34:21 +08:00
Mingshuang Luo
d0d806560f
Change for asr_datamodule.py ( #241 )
...
* change for asr_datamodule.py
* fix style check
* do a fix
2022-03-14 00:30:58 +08:00
Fangjun Kuang
bb7f6ed6b7
Add modified beam search for pruned rnn-t. ( #248 )
...
* Add modified beam search for pruned rnn-t.
* Fix style issues.
* Update RESULTS.md.
* Fix typos.
* Minor fixes.
* Test the pre-trained model using GitHub actions.
* Let the user install optimized_transducer on her own.
* Fix errors in GitHub CI.
2022-03-12 16:16:55 +08:00
Fangjun Kuang
2f4e71f433
Add force alignment for stateless transducer. ( #239 )
...
* Add force alignment for stateless transducer.
* Add more documentation.
* Compute word starting time from framewise token alignment.
* Update README to include force alignment information.
* Fix typos.
* Fix more typos.
* Fixes after review.
2022-03-12 16:16:15 +08:00
Fangjun Kuang
1603744469
Refactor conformer. ( #237 )
2022-03-05 19:26:06 +08:00
yaozengwei
ad62981765
Add diagnostics ( #230 )
...
* Adding diagnostics code...
* Move diagnostics code from local dir to the shared icefall dir
* Remove the diagnostics code in the local dir
* Update docs of arguments, and remove stats_types() function in TensorDiagnosticOptions object.
* Update docs of arguments.
* Add copyright information.
* Corrected the time in copyright information.
Co-authored-by: Daniel Povey <dpovey@gmail.com>
2022-03-04 15:38:23 +08:00
Fangjun Kuang
2f0fbf430c
Remove duplicate files. ( #236 )
2022-03-04 11:56:31 +08:00
Fangjun Kuang
3ec219dfa0
Add stateless transducer tutorial. ( #235 )
...
* WIP: Add stateless transducer tutorial.
* Add more doc.
* Minor fixes.
2022-03-03 22:33:47 +08:00
Fangjun Kuang
1ff6196c44
Fix joiner ( #234 )
...
* Add tests for Joiner
* Remove duplicate files.
2022-03-02 16:41:14 +08:00
Fangjun Kuang
50d2281524
Add modified transducer loss for AIShell dataset ( #219 )
...
* Add modified transducer for aishell.
* Minor fixes.
* Add extra data in transducer training.
The extra data is from http://www.openslr.org/62/
* Update export.py and pretrained.py
* Update CI to install pretrained models with aishell.
* Update results.
* Update results.
* Update README.
* Use symlinks to avoid copies.
2022-03-02 16:02:38 +08:00
Fangjun Kuang
05cb297858
Update result for full libri + GigaSpeech using transducer_stateless. ( #231 )
2022-03-01 17:01:46 +08:00
Fangjun Kuang
72f838dee1
Update results for transducer_stateless after training for more epochs. ( #207 )
2022-03-01 16:35:02 +08:00
PF Luo
ac7c2d84bc
minor fix for aishell recipe ( #223 )
...
* just remove unnecessary torch.sum
* minor fixes for aishell
2022-02-23 08:33:20 +08:00
Fangjun Kuang
2332ba312d
Begin to use multiple datasets in training ( #213 )
...
* Begin to use multiple datasets.
* Finish preparing training datasets.
* Minor fixes
* Copy files.
* Finish training code.
* Display losses for gigaspeech and librispeech separately.
* Fix decode.py
* Make the probability to select a batch from GigaSpeech configurable.
* Update results.
* Minor fixes.
2022-02-21 15:27:27 +08:00
Fangjun Kuang
1c35ae1dba
Reset seed at the beginning of each epoch. ( #221 )
...
* Reset seed at the beginning of each epoch.
* Use a different seed for each epoch.
2022-02-21 15:16:39 +08:00
Fangjun Kuang
cbf8c18ebd
Minor fixes for aishell ( #218 )
...
* Minor fixes to aishell.
* Minor fixes.
2022-02-19 22:28:19 +08:00
PF Luo
277cc3f9bf
update aishell-1 recipe with k2.rnnt_loss ( #215 )
...
* update aishell-1 recipe with k2.rnnt_loss
* fix flake8 style
* typo
* add pretrained model link to result.md
2022-02-19 15:56:39 +08:00
Duo Ma
827b9df51a
Updated Aishell-1 transducer-stateless result ( #217 )
...
* Update RESULTS.md
* Update RESULTS.md
2022-02-19 15:56:04 +08:00
Wei Kang
b702281e90
Use k2 pruned transducer loss to train conformer-transducer model ( #194 )
...
* Using k2 pruned version transducer loss to train model
* Fix style
* Minor fixes
2022-02-17 13:33:54 +08:00
Wang, Guanbo
e8eb408760
Incremental pruning threshold ( #214 )
...
* Incremental pruning threshold
* flake8
* black
* minor fix
2022-02-16 16:59:27 +08:00
Wang, Guanbo
70a3c56a18
Fix librispeech train.py ( #211 )
...
* fix librispeech train.py
* remove note
2022-02-09 16:42:28 +08:00
Wang, Guanbo
be1c86b06c
print num_frame as %.2f ( #204 )
2022-02-08 14:56:58 +08:00
Fangjun Kuang
27fa5f05d3
Update git SHA-1 in RESULTS.md for transducer_stateless. ( #202 )
2022-02-07 18:45:45 +08:00
Fangjun Kuang
a8150021e0
Use modified transducer loss in training. ( #179 )
...
* Use modified transducer loss in training.
* Minor fix.
* Add modified beam search.
* Add modified beam search.
* Minor fixes.
* Fix typo.
* Update RESULTS.
* Fix a typo.
* Minor fixes.
2022-02-07 18:37:36 +08:00
Wei Kang
35ecd7e562
Fix torch.nn.Embedding error for torch below 1.8.0 ( #198 )
2022-02-06 21:59:54 +08:00
Wei Kang
5ae80dfca7
Minor fixes ( #193 )
2022-01-27 18:01:17 +08:00
Piotr Żelasko
8e6fd97c6b
Merge pull request #185 from pzelasko/feature/libri-conformer-phone-ctc
...
Fix using `lang_phone` in conformer CTC training
2022-01-24 18:08:15 -05:00