505 Commits

Author SHA1 Message Date
Daniel Povey
0bf538a4a3 Add negentropy_penalty, on individual dims. 2022-05-10 13:20:10 +08:00
Fangjun Kuang
cd460f7bf1 Stringify torch.__version__ before serializing it. (#354) 2022-05-07 17:18:34 +08:00
Zengwei Yao
20f092e709 Support decoding with averaged model when using --iter (#353)
* support decoding with averaged model when using --iter

* minor fix

* minor fix of copyright date
2022-05-07 13:09:11 +08:00
Mingshuang Luo
f783e10dc8 Do some changes for aishell/ASR/transducer_stateless/export.py (#347)
* do some changes for aishell/ASR/transducer_stateless/export.py
2022-05-07 11:09:31 +08:00
Zengwei Yao
c059ef3169 Keep model_avg on cpu (#348)
* keep model_avg on cpu

* explicitly convert model_avg to cpu

* minor fix

* remove device conversion for model_avg

* modify usage of the model device in train.py

* change model.device to next(model.parameters()).device for decoding

* assert params.start_epoch>0

* assert params.start_epoch>0, params.start_epoch
2022-05-07 10:42:34 +08:00
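The commits above keep a running average of the model's parameters on the CPU while training runs on the GPU. A minimal sketch of the running-average update, with parameters modeled as plain floats (in the real recipe they are torch tensors and the function/variable names here are illustrative):

```python
# Hypothetical sketch: keep model_avg equal to the mean of all model
# snapshots seen so far, using the incremental-mean update
#   avg_n = avg_{n-1} + (x_n - avg_{n-1}) / n
# In icefall the averaged copy is kept on CPU to save GPU memory.

def update_running_average(model_avg, model, count):
    """Update model_avg in place; `count` is the number of snapshots
    seen so far, including the current `model`."""
    for name, value in model.items():
        model_avg[name] += (value - model_avg[name]) / count
    return model_avg

avg = {"w": 1.0}                                   # first snapshot
update_running_average(avg, {"w": 3.0}, count=2)   # avg["w"] is now 2.0
update_running_average(avg, {"w": 5.0}, count=3)   # avg["w"] is now 3.0
```

The incremental form avoids storing all snapshots: each update only needs the current average and the new parameters.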
Guanbo Wang
8e3c89076e Bug fix (#352) 2022-05-07 08:10:54 +08:00
Fangjun Kuang
32f05c00e3 Save batch to disk on exception. (#350) 2022-05-06 17:49:40 +08:00
Zengwei Yao
00c48ec1f3 Model average (#344)
* First upload of model average codes.

* minor fix

* update decode file

* update .flake8

* rename pruned_transducer_stateless3 to pruned_transducer_stateless4

* change epoch number counter starting from 1 instead of 0

* minor fix of pruned_transducer_stateless4/train.py

* refactor the checkpoint.py

* minor fix, update docs, and modify the epoch number to count from 1 in the pruned_transducer_stateless4/decode.py

* update author info

* add docs of the scaling in function average_checkpoints_with_averaged_model
2022-05-05 21:20:04 +08:00
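The last bullet above mentions the scaling inside average_checkpoints_with_averaged_model. The trick is that if each checkpoint stores the cumulative mean of all models up to that epoch, the mean over any later span of epochs can be recovered by rescaling two cumulative means. A sketch with plain floats (names illustrative; the real function operates on torch state dicts):

```python
# If avg_n is the mean of model snapshots 1..n, then the mean over
# epochs (start, end] is:
#   (end * avg_end - start * avg_start) / (end - start)
# because n * avg_n equals the sum of snapshots 1..n.

def average_over_interval(avg_start, avg_end, start, end):
    scale_end = end / (end - start)
    scale_start = start / (end - start)
    return {
        name: scale_end * avg_end[name] - scale_start * avg_start[name]
        for name in avg_end
    }

# Mean of models 1..2 is 2.0; mean of models 1..4 is 5.0.
# Mean of models 3..4 = (4*5.0 - 2*2.0) / 2 = 8.0
print(average_over_interval({"w": 2.0}, {"w": 5.0}, start=2, end=4))
```

This lets decoding average over an arbitrary trailing span of epochs while training only ever stores one running average per checkpoint.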
Fangjun Kuang
8635fb4334 Fix decoding for gigaspeech in the libri + giga setup. (#345) 2022-05-05 20:58:46 +08:00
Daniel Povey
0f7ff7470f Switch sampling to new C++/CUDA backend 2022-05-05 15:44:04 +08:00
Fangjun Kuang
e1c3e98980 Save batch to disk on OOM. (#343)
* Save batch to disk on OOM.

* minor fixes

* Fixes after review.

* Fix style issues.
2022-05-05 15:09:23 +08:00
Fangjun Kuang
9ddbc681e7 Validate generated manifest files. (#338) 2022-05-03 07:08:33 +08:00
Fangjun Kuang
6af15914fa Validate generated manifest files. (#338) 2022-05-03 07:02:54 +08:00
Fangjun Kuang
6dc2e04462 Update results. (#340)
* Update results.

* Typo fixes.
2022-04-29 15:49:45 +08:00
Fangjun Kuang
ac84220de9 Modified conformer with multi datasets (#312)
* Copy files for editing.

* Use librispeech + gigaspeech with modified conformer.

* Support specifying number of workers for on-the-fly feature extraction.

* Feature extraction code for GigaSpeech.

* Combine XL splits lazily during training.

* Fix warnings in decoding.

* Add decoding code for GigaSpeech.

* Fix decoding the gigaspeech dataset.

We have to use the decoder/joiner networks for the GigaSpeech dataset.

* Disable speed perturb for XL subset.

* Compute the Nbest oracle WER for RNN-T decoding.

* Minor fixes.

* Minor fixes.

* Add results.

* Update results.

* Update CI.

* Update results.

* Fix style issues.

* Update results.

* Fix style issues.
2022-04-29 15:40:30 +08:00
Fangjun Kuang
caab6cfd92 Support specifying iteration number of checkpoints for decoding. (#336)
See also #289
2022-04-28 14:09:22 +08:00
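Decoding with --iter instead of --epoch means picking checkpoints by the iteration number embedded in their filenames. A minimal sketch of such a selection, with a hypothetical filename pattern and function name (the real logic lives in icefall's find_checkpoints helper):

```python
import re

# Hypothetical sketch: given checkpoint filenames like
# "checkpoint-12000.pt", select the `avg` checkpoints at or below the
# requested iteration, newest first, for model averaging at decode time.

def select_checkpoints(filenames, iteration, avg):
    numbered = []
    for name in filenames:
        m = re.match(r"checkpoint-(\d+)\.pt$", name)
        if m:
            numbered.append((int(m.group(1)), name))
    eligible = sorted(
        (n for n in numbered if n[0] <= iteration), reverse=True
    )
    return [name for _, name in eligible[:avg]]

names = ["checkpoint-8000.pt", "checkpoint-12000.pt", "checkpoint-4000.pt"]
print(select_checkpoints(names, iteration=12000, avg=2))
# -> ['checkpoint-12000.pt', 'checkpoint-8000.pt']
```

Sorting after parsing the integer avoids the lexicographic trap where "checkpoint-8000.pt" would sort after "checkpoint-12000.pt" as a string.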
Fangjun Kuang
9aeea3e1af Support averaging models with weight tying. (#333) 2022-04-26 13:32:03 +08:00
Daniel Povey
551786b9bd Merge branch 'model-averaging-shared-params' of https://github.com/csukuangfj/icefall into knowledge_base_1b_merge 2022-04-26 13:18:09 +08:00
Fangjun Kuang
0da522cc4c Support averaging models with weight tying. 2022-04-26 13:14:08 +08:00
Daniel Povey
eba025a6b4 Mess with thresholds for printing 2022-04-26 10:39:35 +08:00
Daniel Povey
3ba081e6d9 Add more custom_fwd, custom_bwd 2022-04-25 23:58:34 +08:00
Daniel Povey
2c4478b6d1 Fix for half precision 2022-04-25 23:03:34 +08:00
Daniel Povey
e718c7ac88 Remove unnecessary copy 2022-04-25 20:41:00 +08:00
Daniel Povey
f6619a0b20 Remove unnecessary check 2022-04-25 20:37:06 +08:00
Daniel Povey
7d457a7781 Add some diagnostics 2022-04-25 19:34:19 +08:00
Daniel Povey
edaaec09cd Update backprop of sampling.py to be slightly more efficient. 2022-04-25 19:32:11 +08:00
pehonnet
9a98e6ced6 fix fp16 option in example usage (#332) 2022-04-25 18:51:53 +08:00
Daniel Povey
bbfa484196 Decrease model size; baseline is the one Fangjun is running. 2022-04-25 17:07:20 +08:00
Daniel Povey
aea116ea25 Change printing-prob, initial scales 2022-04-25 14:02:43 +08:00
Daniel Povey
bb7cb82b04 Some fixes/refactoring, make parameters shared 2022-04-25 13:55:27 +08:00
Daniel Povey
0d40b4617a Add knowledge-base lookup to model 2022-04-25 13:40:47 +08:00
Daniel Povey
a359bfe504 Test with CUDA, bug fixes 2022-04-25 13:19:09 +08:00
Daniel Povey
f8c7e6ffb3 Add some training code. Seems to be training successfully... 2022-04-24 23:19:46 +08:00
Daniel Povey
df39fc6783 Fix devices 2022-04-24 22:48:52 +08:00
Daniel Povey
a266922678 First version of sampling.py, tests run. 2022-04-24 22:29:11 +08:00
Daniel Povey
fe5586e847 Change dirname 2022-04-24 19:51:27 +08:00
Daniel Povey
65cd1059f3 Init pruned2_knowledge dir 2022-04-24 19:50:22 +08:00
whsqkaak
d766dc5aee Fix some typos. (#329) 2022-04-22 15:54:59 +08:00
Fangjun Kuang
3607c516d6 Update results for torchaudio RNN-T. (#322) 2022-04-20 11:15:10 +08:00
Fangjun Kuang
fce7f3cd9a Support computing RNN-T loss with torchaudio (#316) 2022-04-19 18:47:13 +08:00
Wei Kang
021c79824e Add LG decoding (#277)
* Add LG decoding

* Add log weight pushing

* Minor fixes
2022-04-19 17:23:46 +08:00
Wang, Guanbo
5fe58de43c GigaSpeech recipe (#120)
* initial commit

* support download, data prep, and fbank

* on-the-fly feature extraction by default

* support BPE based lang

* support HLG for BPE

* small fix

* small fix

* chunked feature extraction by default

* Compute features for GigaSpeech by splitting the manifest.

* Fixes after review.

* Split manifests into 2000 pieces.

* set audio duration mismatch tolerance to 0.01

* small fix

* add conformer training recipe

* Add conformer.py without pre-commit checking

* lazy loading and use SingleCutSampler

* DynamicBucketingSampler

* use KaldifeatFbank to compute fbank for musan

* use pretrained language model and lexicon

* use 3gram to decode, 4gram to rescore

* Add decode.py

* Update .flake8

* Delete compute_fbank_gigaspeech.py

* Use BucketingSampler for valid and test dataloader

* Update params in train.py

* Use bpe_500

* update params in decode.py

* Decrease num_paths when CUDA OOM occurs

* Added README

* Update RESULTS

* black

* Decrease num_paths when CUDA OOM occurs

* Decode with post-processing

* Update results

* Remove lazy_load option

* Use default `storage_type`

* Keep the original tolerance

* Use split-lazy

* black

* Update pretrained model

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2022-04-14 16:07:22 +08:00
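One bullet in the GigaSpeech recipe above splits the manifest into 2000 pieces so feature extraction can run on chunks independently. A minimal sketch of contiguous splitting over a plain list (names illustrative; the recipe uses lhotse's manifest-splitting utilities rather than this helper):

```python
# Hypothetical sketch: split a manifest (here just a list of entries)
# into up to `num_pieces` contiguous chunks of near-equal size, so each
# chunk can be processed by a separate feature-extraction job.

def split_manifest(entries, num_pieces):
    size, extra = divmod(len(entries), num_pieces)
    pieces, start = [], 0
    for i in range(num_pieces):
        end = start + size + (1 if i < extra else 0)
        if start < end:
            pieces.append(entries[start:end])
        start = end
    return pieces

print(split_manifest(list(range(10)), 4))
# -> [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
```

Contiguous chunks keep each piece a valid sub-manifest, and dropping empty pieces handles the case where there are fewer entries than requested splits.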
Mingshuang Luo
d88e786513 Changes for pretrained.py (tedlium3 pruned RNN-T) (#311) 2022-04-14 09:54:07 +08:00
Daniel Povey
62fbfb52d0 Merge pull request #315 from danpovey/mixprec_md300
Add results for mixed precision with max-duration 300
2022-04-13 20:23:07 +08:00
Daniel Povey
af6ae840ee Add results for mixed precision with max-duration 300 2022-04-13 20:22:11 +08:00
Daniel Povey
c0003483d3 Merge pull request #313 from glynpu/fix_comments
fix comments
2022-04-13 14:03:02 +08:00
Guo Liyong
78418ac37c fix comments 2022-04-13 13:09:24 +08:00
Daniel Povey
2a854f5607 Merge pull request #309 from danpovey/update_results
Update results; will further update this before merge
2022-04-12 12:22:48 +08:00
Daniel Povey
9ed7a169e1 Add one more epoch of full expt 2022-04-12 12:20:10 +08:00
Daniel Povey
d0a53aad48 Fix tensorboard log location 2022-04-12 11:51:15 +08:00