101 Commits

Author SHA1 Message Date
Fangjun Kuang
9fad0fd915 Minor fixes. 2021-12-17 20:39:34 +08:00
Fangjun Kuang
be493ad913 Minor fixes. 2021-12-17 20:22:26 +08:00
Fangjun Kuang
9639f6dc0a Minor fixes. 2021-12-17 20:19:36 +08:00
Fangjun Kuang
47b0f2ec2f Update RESULT.md to include RNN-T Conformer. 2021-12-17 20:16:47 +08:00
Fangjun Kuang
164321c79d Minor fixes to make it ready for merge. 2021-12-17 19:43:04 +08:00
Fangjun Kuang
f6a33a85c5 Use stateless decoder. 2021-12-17 16:48:57 +08:00
Fangjun Kuang
fcc22d3e91 Use LSTM layers for the encoder.
Needs more tuning.
2021-12-17 11:58:30 +08:00
Fangjun Kuang
3174bebf07 Add beam search. 2021-12-15 18:50:29 +08:00
Fangjun Kuang
cbda811a10 Minor fixes. 2021-12-15 08:43:38 +08:00
Fangjun Kuang
e38f04e70f Add decoding script. 2021-12-13 19:49:50 +08:00
Fangjun Kuang
73ba843d0a Begin to add decoding script. 2021-12-13 17:08:27 +08:00
Fangjun Kuang
89a08b64ce Remove long utterances to avoid OOM when a large max_duration is used. 2021-12-13 16:41:14 +08:00
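The duration filter described in commit 89a08b64ce can be sketched roughly as below. This is a minimal illustration, not the recipe's code: the `Utterance` class, the `remove_long_utterances` helper, and the 20-second threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    # Hypothetical stand-in for a training cut; only the fields needed here.
    utt_id: str
    duration: float  # length in seconds


def remove_long_utterances(
    utts: List[Utterance], max_seconds: float = 20.0
) -> List[Utterance]:
    """Keep only utterances no longer than max_seconds (assumed threshold)."""
    return [u for u in utts if u.duration <= max_seconds]


utts = [Utterance("a", 3.2), Utterance("b", 27.5), Utterance("c", 14.0)]
kept = remove_long_utterances(utts)
print([u.utt_id for u in kept])
```

Filtering before batching means a single unusually long utterance can no longer dominate the memory of a batch when the per-batch duration budget is large.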
Fangjun Kuang
cd5ed7db20 Add training code. 2021-12-13 13:50:53 +08:00
Fangjun Kuang
232caf51ee Begin to add training script. 2021-12-13 11:15:35 +08:00
Fangjun Kuang
f5199d37c4 Use conformer/transformer model as encoder. 2021-12-07 23:20:59 +08:00
Fangjun Kuang
f802758fca Copy files from conformer_ctc.
Will edit them.
2021-12-07 22:25:31 +08:00
Fangjun Kuang
5802d5ad2e Begin to add RNN-T training for librispeech. 2021-12-07 22:24:18 +08:00
Fangjun Kuang
95af039733
RNN-T training for yesno. (#141)
* RNN-T training for yesno.

* Rename Jointer to Joiner.
2021-12-07 21:44:37 +08:00
Fangjun Kuang
1aff64b708
Apply layer normalization to the output of each gate in LSTM/GRU. (#139)
* Apply layer normalization to the output of each gate in LSTM.

* Apply layer normalization to the output of each gate in GRU.

* Add projection support to LayerNormLSTMCell.

* Add GPU tests.

* Use typeguard.check_argument_types() to validate type annotations.

* Add typeguard as a requirement.

* Minor fixes.

* Fix CI.

* Fix CI.

* Fix test failures for torch 1.8.0

* Fix errors.
2021-12-07 18:38:03 +08:00
pingfengluo
d1adc25338
Update AIShell recipe result (#140)
* add MMI to AIShell

* fix MMI decode graph

* export model

* typo

* fix code style

* typo

* fix data prepare to just use train text by uid

* use a faster way to get the intersection of train and aishell_transcript_v0.8.txt

* update AIShell result

* update

* typo
2021-12-04 14:43:04 +08:00
pingfengluo
89b84208aa
add phone based LF-MMI training to AIShell recipe (#137)
* add MMI to AIShell

* fix MMI decode graph

* export model

* typo

* fix code style

* typo
2021-12-02 12:32:23 +08:00
Fangjun Kuang
ec591698b0
Associate a cut with token alignment (without repeats) (#125)
* WIP: Associate a cut with token alignment (without repeats)

* Save framewise alignments with/without repeats.

* Minor fixes.
2021-11-29 18:50:54 +08:00
Fangjun Kuang
243fb9723c
Fix an error introduced while supporting torchscript. (#134)
Should be `G.dummy = 1`, not `G["dummy"] = 1`.
2021-11-27 09:07:04 +08:00
Fangjun Kuang
0e541f5b5d
Print hostname and IP address to the log. (#131)
We are using multiple machines to do various experiments. It makes
life easier to know which experiment is running on which machine
if we also log the IP and hostname of the machine.
2021-11-26 11:25:59 +08:00
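A minimal sketch of the idea in commit 0e541f5b5d, using only the Python standard library. The helper name `get_host_info` and the log format are assumptions for illustration; the actual change in the repository may be implemented differently.

```python
import logging
import socket
from typing import Tuple


def get_host_info() -> Tuple[str, str]:
    """Return (hostname, ip); ip falls back to "unknown" if lookup fails."""
    hostname = socket.gethostname()
    try:
        ip = socket.gethostbyname(hostname)
    except OSError:
        # Name resolution can fail on machines without a matching hosts entry.
        ip = "unknown"
    return hostname, ip


logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
hostname, ip = get_host_info()
logging.info("Running on host %s (IP %s)", hostname, ip)
```

Logging this once at startup is enough to tell afterwards which machine produced a given experiment log.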
LIyong.Guo
00e2f0ade8
Draft streaming decoding (#89)
* reusable parts from conformer_ctc

* streaming conformer code

* a trained model
2021-11-24 19:35:18 +08:00
Lucky Wong
769a9791ec
Fix no attribute 'data' error. (#129) 2021-11-22 18:31:04 +08:00
Wei Kang
4151cca147
Add torch script support for Aishell and update documents (#124)
* Add aishell recipe

* Remove unnecessary code and update docs

* adapt to k2 v1.7, add docs and results

* Update conformer ctc model

* Update docs, pretrained.py & results

* Fix code style

* Fix code style

* Fix code style

* Minor fix

* Minor fix

* Fix pretrained.py

* Update pretrained model & corresponding docs

* Export torch script model for Aishell

* Add C++ deployment docs

* Minor fixes

* Fix unit test

* Update Readme
2021-11-19 16:37:05 +08:00
Wei Kang
30c43b7f69
Add aishell recipe (#30)
* Add aishell recipe

* Remove unnecessary code and update docs

* adapt to k2 v1.7, add docs and results

* Update conformer ctc model

* Update docs, pretrained.py & results

* Fix code style

* Fix code style

* Fix code style

* Minor fix

* Minor fix

* Fix pretrained.py

* Update pretrained model & corresponding docs
2021-11-18 10:00:47 +08:00
Fangjun Kuang
0660d12e4e
Fix computing WERs for empty hypotheses (#118)
* Fix computing WERs when empty lattices are generated.

* Minor fixes.
2021-11-17 19:25:47 +08:00
Fangjun Kuang
336283f872
New label smoothing (#109)
* Modify label smoothing to match the one implemented in PyTorch.

* Enable CI for torch 1.10

* Fix CI errors.

* Fix CI installation errors.

* Fix CI installation errors.

* Minor fixes.

* Minor fixes.

* Minor fixes.

* Minor fixes.

* Minor fixes.

* Fix CI errors.
2021-11-17 19:24:07 +08:00
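For reference, the label smoothing that commit 336283f872 aligns with can be sketched in plain Python, assuming the definition used by PyTorch's `torch.nn.CrossEntropyLoss(label_smoothing=...)`: the target becomes a mixture of the one-hot label and a uniform distribution over all classes. The function name and the example probabilities are hypothetical, and details such as padding handling and loss reduction in the recipe are not shown.

```python
import math
from typing import List


def label_smoothing_loss(
    log_probs: List[float], target: int, eps: float = 0.1
) -> float:
    """Cross entropy against (1 - eps) * one_hot + eps / K * uniform."""
    num_classes = len(log_probs)
    loss = 0.0
    for k, lp in enumerate(log_probs):
        q = eps / num_classes  # uniform share given to every class
        if k == target:
            q += 1.0 - eps  # remaining mass stays on the true label
        loss -= q * lp
    return loss


probs = [0.7, 0.2, 0.1]
log_probs = [math.log(p) for p in probs]
nll = label_smoothing_loss(log_probs, target=0, eps=0.0)
smoothed = label_smoothing_loss(log_probs, target=0, eps=0.1)
print(nll, smoothed)
```

With `eps=0.0` the function reduces to the ordinary negative log-likelihood of the target class.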
Mingshuang Luo
2e0f255ada
Add timit recipe (including the code scripts and the docs) for icefall (#114)
* add timit recipe for icefall

* add shared file

* update the docs for timit recipe

* Delete shared

* update the timit recipe and check style

* Update model.py

* Do some changes

* Update model.py

* Update model.py

* Add README.md and RESULTS.md

* Update RESULTS.md

* Update README.md

* update the docs for timit recipe
2021-11-17 11:23:45 +08:00
Fangjun Kuang
68506609ad
Set fsa.properties to None after changing its labels in-place. (#121) 2021-11-16 23:11:30 +08:00
Fangjun Kuang
8d679c3e74
Fix typos. (#115) 2021-11-10 14:45:30 +08:00
Fangjun Kuang
21096e99d8
Update result for the librispeech recipe using vocab size 500 and att rate 0.8 (#113)
* Update RESULTS using vocab size 500, att rate 0.8

* Update README.

* Refactoring.

Since FSAs in an Nbest object are linear in structure, we can
add the scores of a path to compute the total scores.

* Update documentation.

* Change default vocab size from 5000 to 500.
2021-11-10 14:32:52 +08:00
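The refactoring note in commit 21096e99d8 relies on a simple property: a linear FSA contains exactly one path, so its total score is just the sum of its arc scores. The sketch below illustrates this with plain lists of log-domain scores; it does not use the k2 API, and the function and variable names are hypothetical.

```python
from typing import Dict, List


def total_score_of_linear_path(arc_scores: List[float]) -> float:
    """A linear FSA has a single path, so its total score is the arc sum."""
    return sum(arc_scores)


# Each hypothesis is represented only by the scores on its arcs.
paths: Dict[str, List[float]] = {
    "hyp1": [-0.5, -1.25, -0.25],
    "hyp2": [-1.0, -2.0],
}
totals = {name: total_score_of_linear_path(s) for name, s in paths.items()}
best = max(totals, key=totals.get)
print(totals, best)
```

Because only one path exists, no general shortest-path or forward computation over the FSA is needed to obtain its score.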
Fangjun Kuang
42b437bea6
Use pre-sorted text to generate token ids for attention decoder. (#98)
* Use pre-sorted text to generate token ids for attention decoder.

See https://github.com/k2-fsa/icefall/issues/97
for more details.

* Fix typos.
2021-10-29 13:46:41 +08:00
Fangjun Kuang
8cb7f712e4
Use GPU for averaging checkpoints if possible. (#84) 2021-10-26 17:10:04 +08:00
Fangjun Kuang
712ead8207
Fix an error when attention decoder rescoring returns None. (#90) 2021-10-22 19:52:25 +08:00
Piotr Żelasko
3cc99d2af2 make flake8 happy 2021-10-19 11:24:54 -04:00
Piotr Żelasko
86f3e0ef37 Make flake8 happy 2021-10-18 09:54:40 -04:00
Piotr Żelasko
6fbd7a287c Refactor OOM batch scanning into a local function 2021-10-18 09:53:04 -04:00
Piotr Żelasko
d509d58f30 Merge branch 'master' into feature/find-pessimistic-batches 2021-10-18 09:47:21 -04:00
Fangjun Kuang
3effcb4225
Fix typos. (#85) 2021-10-18 16:17:14 +08:00
Fangjun Kuang
53b79fafa7
Add MMI training with word pieces as modelling unit. (#6)
* Fix an error in TDNN-LSTM training.

* WIP: Refactoring

* Refactor transformer.py

* Remove unused code.

* Minor fixes.

* Fix decoder padding mask.

* Add MMI training with word pieces.

* Remove unused files.

* Minor fixes.

* Refactoring.

* Minor fixes.

* Use pre-computed alignments in LF-MMI training.

* Minor fixes.

* Update decoding script.

* Add doc about how to check and use extracted alignments.

* Fix style issues.

* Fix typos.

* Fix style issues.

* Disable macOS tests for now.
2021-10-18 15:20:32 +08:00
Fangjun Kuang
4890e27b45
Extract framewise alignment information using CTC decoding (#39)
* Use new APIs with k2.RaggedTensor

* Fix style issues.

* Update the installation doc, saying it requires at least k2 v1.7

* Extract framewise alignment information using CTC decoding.

* Print environment information.

Print information about k2, lhotse, PyTorch, and icefall.

* Fix CI.

* Fix CI.

* Compute framewise alignment information of the LibriSpeech dataset.

* Update comments for the time to compute alignments of train-960.

* Preserve cut id in mix cut transformer.

* Minor fixes.

* Add doc about how to extract framewise alignments.
2021-10-18 14:24:33 +08:00
Piotr Żelasko
403d1744ff Introduce backprop in finding OOM batches 2021-10-15 10:05:13 -04:00
Piotr Żelasko
060117a9ff Reformatting 2021-10-14 21:40:14 -04:00
Piotr Żelasko
1c7c79f2fc Find CUDA OOM batches before starting training 2021-10-14 21:28:11 -04:00
Fangjun Kuang
fee1f84b20
Test pre-trained model in CI (#80)
* Add CI to run pre-trained models.

* Minor fixes.

* Install kaldifeat

* Install a CPU version of PyTorch.

* Fix CI errors.

* Disable decoder layers in pretrained.py if it is not used.

* Clone pre-trained model from GitHub.

* Minor fixes.

* Minor fixes.

* Minor fixes.
2021-10-15 00:41:33 +08:00
Mingshuang Luo
5401ce199d
Update ctc-decoding on pretrained.py and conformer_ctc.rst (#78) 2021-10-14 23:29:06 +08:00
Fangjun Kuang
f2387fe523
Fix a bug introduced while supporting torch script. (#79) 2021-10-14 20:09:38 +08:00