390 Commits

Author SHA1 Message Date
Daniel Povey
beaf5bfbab Merge specaug change from Mingshuang. 2022-02-08 19:42:23 +08:00
Daniel Povey
395065eb11 Merge branch 'spec-augment-change' of https://github.com/luomingshuang/icefall into attention_relu_specaug 2022-02-08 19:40:33 +08:00
Mingshuang Luo
3323cabf46 Experiments based on SpecAugment change 2022-02-08 14:25:31 +08:00
Fangjun Kuang
27fa5f05d3 Update git SHA-1 in RESULTS.md for transducer_stateless. (#202) 2022-02-07 18:45:45 +08:00
Fangjun Kuang
a8150021e0 Use modified transducer loss in training. (#179)
* Use modified transducer loss in training.

* Minor fix.

* Add modified beam search.

* Add modified beam search.

* Minor fixes.

* Fix typo.

* Update RESULTS.

* Fix a typo.

* Minor fixes.
2022-02-07 18:37:36 +08:00
Daniel Povey
a859dcb205 Remove learnable offset, use relu instead. 2022-02-07 12:14:48 +08:00
Wei Kang
35ecd7e562 Fix torch.nn.Embedding error for torch below 1.8.0 (#198) 2022-02-06 21:59:54 +08:00
Daniel Povey
48a764eccf Add min in q,k,v of attention 2022-02-06 21:19:37 +08:00
Daniel Povey
8f8ec223a7 Changes to fbank computation, use lilcom chunky writer 2022-02-06 21:18:40 +08:00
pkufool
fcd25bdfff Fix torch.nn.Embedding error for torch below 1.8.0 2022-02-06 18:22:56 +08:00
Wei Kang
5ae80dfca7 Minor fixes (#193) 2022-01-27 18:01:17 +08:00
Piotr Żelasko
1731cc37bb Black 2022-01-24 10:20:22 -05:00
Piotr Żelasko
f92c24a73a Merge branch 'master' into feature/libri-conformer-phone-ctc 2022-01-24 10:18:56 -05:00
Piotr Żelasko
565c1d8413 Address code review 2022-01-24 10:17:47 -05:00
Piotr Żelasko
1d5fe8afa4 flake8 2022-01-21 17:27:02 -05:00
Piotr Żelasko
f0f35e6671 black 2022-01-21 17:22:41 -05:00
Piotr Żelasko
f28951f2b6 Add an assertion 2022-01-21 17:16:49 -05:00
Piotr Żelasko
3d109b121d Remove train_phones.py and modify train.py instead 2022-01-21 17:08:53 -05:00
Fangjun Kuang
d6050eb02e Fix calling optimized_transducer after new release. (#182) 2022-01-21 08:18:50 +08:00
Guanbo Wang
e6017bae39 Merge remote-tracking branch 'upstream/master' into gigaspeech_recipe 2022-01-20 02:11:04 -05:00
Fangjun Kuang
f94ff19bfe Refactor beam search and update results. (#177) 2022-01-18 16:40:19 +08:00
wgb14
652646ab8f use pretrained language model and lexicon 2022-01-17 19:05:51 -05:00
wgb14
72abd38f27 use KaldifeatFbank to compute fbank for musan 2022-01-17 18:54:16 -05:00
Fangjun Kuang
273e5fb2f3 Update git SHA1 for transducer_stateless model. (#174) 2022-01-10 11:58:17 +08:00
Fangjun Kuang
4c1b3665ee Use optimized_transducer to compute transducer loss. (#162)
* WIP: Use optimized_transducer to compute transducer loss.

* Minor fixes.

* Fix decoding.

* Fix decoding.

* Add RESULTS.

* Update RESULTS.

* Update CI.

* Fix sampling rate for yesno recipe.
2022-01-10 11:54:58 +08:00
Piotr Żelasko
319e120869 Update feature config (compatible with Lhotse PR #525) (#172)
* Update feature config (compatible with Lhotse PR #525)

* black
2022-01-10 11:39:28 +08:00
Lucky Wong
6caff5fd38 minor fixes (#169)
* Fix no attribute 'data' error.

* minor fixes
2022-01-06 10:24:16 +08:00
pingfengluo
ea8af0ee9a add transducer_stateless with char unit to AIShell (#164) 2022-01-01 18:32:08 +08:00
wgb14
6e5b189fc5 DynamicBucketingSampler 2021-12-29 15:22:46 -05:00
Fangjun Kuang
413b2e8569 Add git sha1 to RESULTS.md for conformer encoder + stateless decoder. (#160) 2021-12-28 12:04:01 +08:00
Fangjun Kuang
14c93add50 Remove batchnorm, weight decay, and SOS from transducer conformer encoder (#155)
* Remove batchnorm, weight decay, and SOS.

* Make --context-size configurable.

* Update results.
2021-12-27 16:01:10 +08:00
Fangjun Kuang
8187d6236c Minor fix to maximum number of symbols per frame for RNN-T decoding. (#157)
* Minor fix to maximum number of symbols per frame for RNN-T decoding.

* Minor fixes.
2021-12-24 21:48:40 +08:00
Fangjun Kuang
5b6699a835 Minor fixes to the RNN-T Conformer model (#152)
* Disable weight decay.

* Remove input feature batchnorm.

* Replace BatchNorm in the Conformer model with LayerNorm.

* Use tanh in the joint network.

* Remove sos ID.

* Reduce the number of decoder layers from 4 to 2.

* Minor fixes.

* Fix typos.
2021-12-23 13:54:25 +08:00
Fangjun Kuang
fb6a57e9e0 Increase the size of the context in the RNN-T decoder. (#153) 2021-12-23 07:55:02 +08:00
Fangjun Kuang
cb04c8a750 Limit the number of symbols per frame in RNN-T decoding. (#151) 2021-12-18 11:00:42 +08:00
Fangjun Kuang
1d44da845b RNN-T Conformer training for LibriSpeech (#143)
* Begin to add RNN-T training for librispeech.

* Copy files from conformer_ctc.

Will edit it.

* Use conformer/transformer model as encoder.

* Begin to add training script.

* Add training code.

* Remove long utterances to avoid OOM when a large max_duration is used.

* Begin to add decoding script.

* Add decoding script.

* Minor fixes.

* Add beam search.

* Use LSTM layers for the encoder.

Needs more tuning.

* Use stateless decoder.

* Minor fixes to make it ready for merge.

* Fix README.

* Update RESULTS.md to include RNN-T Conformer.

* Minor fixes.

* Fix tests.

* Minor fixes.

* Minor fixes.

* Fix tests.
2021-12-18 07:42:51 +08:00
wgb14
bea78f6094 lazy loading and use SingleCutSampler 2021-12-17 00:38:52 -05:00
Guanbo Wang
532309bf72 Add conformer.py without pre-commit checking 2021-12-16 20:20:41 -05:00
wgb14
76a289126f add conformer training recipe 2021-12-16 20:18:02 -05:00
Guanbo Wang
71ef6a9e11 Merge remote-tracking branch 'upstream/master' into gigaspeech_recipe 2021-12-16 19:13:14 -05:00
Wei Kang
76a51bf037 Fix aishell tdnn_lstm_ctc decoding (#149) 2021-12-14 14:42:58 +08:00
Wei Kang
a183d5bfd7 Remove batchnorm (#147)
* Remove batch normalization

* Minor fixes

* Fix typo

* Fix comments

* Add assertion for use_feat_batchnorm
2021-12-14 08:20:03 +08:00
Fangjun Kuang
95af039733 RNN-T training for yesno. (#141)
* RNN-T training for yesno.

* Rename Jointer to Joiner.
2021-12-07 21:44:37 +08:00
Fangjun Kuang
1aff64b708 Apply layer normalization to the output of each gate in LSTM/GRU. (#139)
* Apply layer normalization to the output of each gate in LSTM.

* Apply layer normalization to the output of each gate in GRU.

* Add projection support to LayerNormLSTMCell.

* Add GPU tests.

* Use typeguard.check_argument_types() to validate type annotations.

* Add typeguard as a requirement.

* Minor fixes.

* Fix CI.

* Fix CI.

* Fix test failures for torch 1.8.0

* Fix errors.
2021-12-07 18:38:03 +08:00
pingfengluo
d1adc25338 Update AIShell recipe result (#140)
* add MMI to AIShell

* fix MMI decode graph

* export model

* typo

* fix code style

* typo

* fix data prepare to just use train text by uid

* use a faster way to get the intersection of train and aishell_transcript_v0.8.txt

* update AIShell result

* update

* typo
2021-12-04 14:43:04 +08:00
wgb14
4316ec43d7 small fix 2021-12-03 16:34:36 -05:00
pingfengluo
89b84208aa add phone based LF-MMI training to AIShell recipe (#137)
* add MMI to AIShell

* fix MMI decode graph

* export model

* typo

* fix code style

* typo
2021-12-02 12:32:23 +08:00
wgb14
64bd3f7df4 set audio duration mismatch tolerance to 0.01 2021-12-01 17:49:46 -05:00
Fangjun Kuang
8109c2b913 Split manifests into 2000 pieces. 2021-11-30 12:04:15 +08:00
Fangjun Kuang
ec591698b0 Associate a cut with token alignment (without repeats) (#125)
* WIP: Associate a cut with token alignment (without repeats)

* Save framewise alignments with/without repeats.

* Minor fixes.
2021-11-29 18:50:54 +08:00