1216 Commits

Author SHA1 Message Date
Mingshuang Luo
39bc8cae94
Add ctc decoding to pretrained.py on conformer_ctc (#75)
* Add ctc-decoding to pretrained.py

* update pretrained.py and conformer_ctc.rst

* update ctc-decoding for pretrained.py on conformer_ctc

* Update pretrained.py

* fix the style issue

* Update conformer_ctc.rst

* Update the running logs
2021-10-13 12:20:16 +08:00
Mingshuang Luo
391432b356
Update train.py ("10"--->"params.log_interval") (#76)
* Update train.py

* Update train.py

* Update train.py
2021-10-12 21:30:31 +08:00
Mingshuang Luo
597c5efdb1
Use LossRecord to record and print the loss for the training process (#62)
* Update index.rst (AS->ASR)

* Update conformer_ctc.rst (pretraind->pretrained)

* Fix some spelling errors.

* Fix some spelling errors.

* Use LossRecord to record and print loss in the training process

* Change the name "LossRecord" to "MetricsTracker"
2021-10-12 15:58:03 +08:00
Fangjun Kuang
beb54ddb61
Support torch script. (#65)
* WIP: Support torchscript.

* Minor fixes.

* Fix style issues.

* Add documentation about how to deploy a trained model.
2021-10-12 14:55:05 +08:00
Piotr Żelasko
d54828e73a
Merge pull request #73 from pzelasko/feature/bucketing-in-test
Use BucketingSampler for dev and test data
2021-10-09 10:58:29 -04:00
Piotr Żelasko
069ebaf9ba Reformatting 2021-10-09 14:45:46 +00:00
Mingshuang Luo
6e43905d12
Update the documentation to include "ctc-decoding" (#71)
* Update conformer_ctc.rst
2021-10-09 11:56:25 +08:00
Piotr Żelasko
b682467e4d Use BucketingSampler for dev and test data 2021-10-08 22:32:13 -04:00
Piotr Żelasko
adb068eb82
setup.py (#64) 2021-10-01 16:43:08 +08:00
Fangjun Kuang
707d7017a7
Support pure ctc decoding requiring neither a lexicon nor an n-gram LM (#58)
* Rename lattice_score_scale to nbest_scale.

* Support pure CTC decoding requiring neither a lexicon nor an n-gram LM.

* Fix style issues.

* Fix a typo.

* Minor fixes.
2021-09-26 14:21:49 +08:00
Fangjun Kuang
455693aede
Fix hasattr of AttributeDict. (#52) 2021-09-22 16:37:20 +08:00
Fangjun Kuang
a80e58e15d
Refactor decode.py to make it more readable and more modular. (#44)
* Refactor decode.py to make it more readable and more modular.

* Fix an error.

Nbest.fsa should always have token IDs as labels and
word IDs as aux_labels.

* Add nbest decoding.

* Compute edit distance with k2.

* Refactor nbest-oracle.

* Add rescore with nbest lists.

* Add whole-lattice rescoring.

* Add rescoring with attention decoder.

* Refactoring.

* Fixes after refactoring.

* Fix a typo.

* Minor fixes.

* Replace [] with () for shapes.

* Use k2 v1.9

* Use Levenshtein graphs/alignment from k2 v1.9

* [doc] Require k2 >= v1.9

* Minor fixes.
2021-09-20 15:44:54 +08:00
Fangjun Kuang
cc77cb3459
Fix decode.py to remove the correct axis. (#50)
* Fix decode.py to remove the correct axis.

* Run GitHub actions manually.
2021-09-17 16:49:03 +08:00
Wei Kang
9a6e0489c8
update api for RaggedTensor (#45)
* Fix code style

* update k2 version in CI

* fix compile hlg
2021-09-14 16:39:56 +08:00
Fangjun Kuang
a2be2896a9
Fix the link to k2's installation doc. (#46) 2021-09-14 13:39:52 +08:00
Wei Kang
24656e9749
Update docs and remove unnecessary arguments (#42)
* Fix typo in docs

* Update docs and remove unnecessary arguments

* Fix code style
2021-09-13 18:28:57 +08:00
Fangjun Kuang
f792b466bf
Change default value of lattice-score-scale from 1.0 to 0.5 (#41)
* Change the default value of lattice-score-scale from 1.0 to 0.5

* Fix CI.
2021-09-13 10:49:18 +08:00
Fangjun Kuang
7f8e3a673a
Add commands for reproducing. (#40)
* Add commands for reproducing.

* Use --bucketing-sampler by default.
2021-09-09 13:50:31 +08:00
Fangjun Kuang
abadc71415
Use new APIs with k2.RaggedTensor (#38)
* Use new APIs with k2.RaggedTensor

* Fix style issues.

* Update the installation doc, saying it requires at least k2 v1.7

* Use k2 v1.7
2021-09-08 14:55:30 +08:00
Fangjun Kuang
331e5eb7ab
[doc] Fix typos. (#31) 2021-09-02 07:12:37 +08:00
Mingshuang Luo
5baa6a9f1c
fix a spelling mistake (tourch->touch) (#29) v1.0 2021-08-25 21:41:46 +08:00
Mingshuang Luo
eed3fc5610
Correct some spelling mistakes (#28)
* Update index.rst (AS->ASR)

* Update conformer_ctc.rst (pretraind->pretrained)
2021-08-25 17:48:34 +08:00
Fangjun Kuang
184dbb3ea5
Add documentation about code style and creating new recipes. (#27) 2021-08-25 14:48:41 +08:00
Fangjun Kuang
96e7f5c7ea
Release v0.1 (#26) v0.1 2021-08-24 21:30:30 +08:00
pkufool
f4223ee110
Add TDNN-LSTM-CTC Results (#25)
* Add tdnn-lstm pretrained model and results

* Add docs for TDNN-LSTM-CTC

* Minor fix

* Fix typo

* Fix style checking
2021-08-24 21:09:27 +08:00
Fangjun Kuang
1bd5dcc8ac
WIP: Add doc for the LibriSpeech recipe. (#24)
* WIP: Add doc for the LibriSpeech recipe.

* Add more doc for LibriSpeech recipe.

* Add more doc for the LibriSpeech recipe.

* More doc.
2021-08-24 20:28:32 +08:00
Fangjun Kuang
01da00dca0
WIP: Add documentation. (#22)
* Begin to add documentation.

* WIP: Add documentation.

* Fix a typo.

* Add more doc for the recipe yesno.

* Add more doc for the yesno recipe.
2021-08-24 14:28:08 +08:00
Fangjun Kuang
57cb611665
[yesno] Remove padding in TDNN (#21)
* Disable SpecAug for yesno.

Also replace Adam with SGD.

* Remove padding in the model to make the results reproducible.
2021-08-23 15:59:36 +08:00
Fangjun Kuang
6c2c9b9d74
Add recipe for the yes_no dataset. (#16)
* Add recipe for the yes_no dataset.

* Refactoring: Remove unused code.

* Add Colab notebook for the yesno dataset.

* Add GitHub actions to run yesno.

* Fix a typo.

* Minor fixes.

* Train more epochs for GitHub actions.

* Minor fixes.

* Minor fixes.

* Fix style issues.
2021-08-23 11:36:29 +08:00
pkufool
19c4214958
Fix code style and add copyright. (#18)
* Fix style and add copyright

* Minor fix

* Remove duplicate lines

* Reformat conformer.py by black

* Reformat code style with black.

* Fix github workflows

* Fix lhotse installation

* Install icefall requirements

* Update k2 version, remove lhotse from test workflow
2021-08-23 10:43:59 +08:00
Fangjun Kuang
8469f9ae0a
Refactor asr_datamodule. (#15)
* WIP: Refactor asr_datamodule.

* Fixes after review.

* Minor fixes.
2021-08-21 09:53:46 +08:00
Fangjun Kuang
0b656e4e1c
Add a link to Colab. (#14)
It demonstrates the usage of pre-trained models.
2021-08-20 15:43:25 +08:00
Fangjun Kuang
9d0cc9d829
Support computing nbest oracle WER. (#10)
* Support computing nbest oracle WER.

* Add scale to all nbest based decoding/rescoring methods.

* Add script to run pretrained models.

* Use torchaudio to extract features.

* Support decoding multiple files at the same time.

Also, use kaldifeat for feature extraction.

* Support decoding with LM rescoring and attention-decoder rescoring.

* Minor fixes.

* Replace scale with lattice-score-scale.

* Add usage example with a provided pretrained model.
2021-08-20 11:53:37 +08:00
pkufool
ef233486ae
The training script produces a WER of 2.57% on librispeech test-clean (#13)
* Add grad_clip and weight-decay, small fix of dataloader and masking

* Add RESULTS.md
2021-08-20 10:08:08 +08:00
Fangjun Kuang
caa0b9e942
Fix an error in displaying decoding process. (#12) 2021-08-19 14:54:01 +08:00
Fangjun Kuang
1c3b13c7eb
Minor fixes. (#9) 2021-08-16 19:01:25 +08:00
Fangjun Kuang
12a2fd023e
Add doc about installation and usage (#7)
* Add readme.

* Add TOC.

* fix typos

* Minor fixes after review.
2021-08-12 12:44:04 +08:00
Fangjun Kuang
5a0b9bcb23
Refactoring (#4)
* Fix an error in TDNN-LSTM training.

* WIP: Refactoring

* Refactor transformer.py

* Remove unused code.

* Minor fixes.
2021-08-04 14:53:02 +08:00
Daniel Povey
cf8d76293d
Merge pull request #3 from csukuangfj/style-check
Add CTC training
2021-07-31 15:36:00 +08:00
Fangjun Kuang
398ed80d7a Minor fixes to support DDP training. 2021-07-31 15:26:57 +08:00
Fangjun Kuang
b94d97da37 Disable gradient computation in evaluation mode. 2021-07-29 20:37:31 +08:00
Fangjun Kuang
acc63a9172 WIP: Add BPE training code. 2021-07-29 20:23:52 +08:00
Fangjun Kuang
bd69e4be32 Use attention decoder for rescoring. 2021-07-28 12:22:09 +08:00
Fangjun Kuang
f65854cca5 Add BPE decoding results. 2021-07-27 17:38:47 +08:00
Fangjun Kuang
4ccae509d3 WIP: Begin to add BPE decoding 2021-07-26 20:06:58 +08:00
Fangjun Kuang
d3101fb005 Fix loading checkpoint in DDP training. 2021-07-26 08:08:14 +08:00
Fangjun Kuang
78bb65ed78 Fix an error in DDP training. 2021-07-25 22:33:09 +08:00
Fangjun Kuang
8055bf31a0 Support DDP training. 2021-07-25 21:40:09 +08:00
Fangjun Kuang
4a66712406 Add LM rescoring. 2021-07-25 18:21:26 +08:00
Fangjun Kuang
6f9fe5b906 Refactor decoding code. 2021-07-24 22:23:50 +08:00