17 Commits

Author SHA1 Message Date
Fangjun Kuang
707d7017a7
Support pure CTC decoding requiring neither a lexicon nor an n-gram LM (#58)
* Rename lattice_score_scale to nbest_scale.

* Support pure CTC decoding requiring neither a lexicon nor an n-gram LM.

* Fix style issues.

* Fix a typo.

* Minor fixes.
2021-09-26 14:21:49 +08:00
Fangjun Kuang
a80e58e15d
Refactor decode.py to make it more readable and more modular. (#44)
* Refactor decode.py to make it more readable and more modular.

* Fix an error.

Nbest.fsa should always have token IDs as labels and
word IDs as aux_labels.

* Add nbest decoding.

* Compute edit distance with k2.

* Refactor nbest-oracle.

* Add rescore with nbest lists.

* Add whole-lattice rescoring.

* Add rescoring with attention decoder.

* Refactoring.

* Fixes after refactoring.

* Fix a typo.

* Minor fixes.

* Replace [] with () for shapes.

* Use k2 v1.9

* Use Levenshtein graphs/alignment from k2 v1.9

* [doc] Require k2 >= v1.9

* Minor fixes.
2021-09-20 15:44:54 +08:00
Wei Kang
24656e9749
Update docs and remove unnecessary arguments (#42)
* Fix typo in docs

* Update docs and remove unnecessary arguments

* Fix code style
2021-09-13 18:28:57 +08:00
Fangjun Kuang
f792b466bf
Change default value of lattice-score-scale from 1.0 to 0.5 (#41)
* Change the default value of lattice-score-scale from 1.0 to 0.5

* Fix CI.
2021-09-13 10:49:18 +08:00
Fangjun Kuang
abadc71415
Use new APIs with k2.RaggedTensor (#38)
* Use new APIs with k2.RaggedTensor

* Fix style issues.

* Update the installation doc, saying it requires at least k2 v1.7

* Use k2 v1.7
2021-09-08 14:55:30 +08:00
Fangjun Kuang
184dbb3ea5
Add documentation about code style and creating new recipes. (#27) 2021-08-25 14:48:41 +08:00
pkufool
f4223ee110
Add TDNN-LSTM-CTC Results (#25)
* Add tdnn-lstm pretrained model and results

* Add docs for TDNN-LSTM-CTC

* Minor fix

* Fix typo

* Fix style checking
2021-08-24 21:09:27 +08:00
Fangjun Kuang
6c2c9b9d74
Add recipe for the yes_no dataset. (#16)
* Add recipe for the yes_no dataset.

* Refactoring: Remove unused code.

* Add Colab notebook for the yesno dataset.

* Add GitHub actions to run yesno.

* Fix a typo.

* Minor fixes.

* Train more epochs for GitHub actions.

* Minor fixes.

* Minor fixes.

* Fix style issues.
2021-08-23 11:36:29 +08:00
pkufool
19c4214958
Fix code style and add copyright. (#18)
* Fix style and add copyright

* Minor fix

* Remove duplicate lines

* Reformat conformer.py with black

* Reformat code style with black.

* Fix github workflows

* Fix lhotse installation

* Install icefall requirements

* Update k2 version, remove lhotse from test workflow
2021-08-23 10:43:59 +08:00
Fangjun Kuang
8469f9ae0a
Refactor asr_datamodule. (#15)
* WIP: Refactor asr_datamodule.

* Fixes after review.

* Minor fixes.
2021-08-21 09:53:46 +08:00
Fangjun Kuang
caa0b9e942
Fix an error in displaying decoding process. (#12) 2021-08-19 14:54:01 +08:00
Fangjun Kuang
5a0b9bcb23
Refactoring (#4)
* Fix an error in TDNN-LSTM training.

* WIP: Refactoring

* Refactor transformer.py

* Remove unused code.

* Minor fixes.
2021-08-04 14:53:02 +08:00
Fangjun Kuang
acc63a9172 WIP: Add BPE training code. 2021-07-29 20:23:52 +08:00
Fangjun Kuang
f65854cca5 Add BPE decoding results. 2021-07-27 17:38:47 +08:00
Fangjun Kuang
4a66712406 Add LM rescoring. 2021-07-25 18:21:26 +08:00
Fangjun Kuang
6f9fe5b906 Refactor decoding code. 2021-07-24 22:23:50 +08:00
Fangjun Kuang
f3542c7793 Add CTC training. 2021-07-24 17:13:20 +08:00