56 Commits

Author SHA1 Message Date
Mahsa Yarmohammadi
021e1a8846
Add acknowledgment to README (#1950) 2025-05-22 22:06:35 +08:00
zr_jin
66225fbe33
VITS recipe for LibriTTS corpus (#1776) 2024-11-01 15:33:13 +08:00
Yifan Yang
37a1420603
remove incomplete recipe (#1778)
Co-authored-by: yifanyeung <v-yifanyang@microsoft.com>
2024-10-24 13:16:18 +08:00
Yifan Yang
6ac3343ce5
fix path in README.md (#1722) 2024-08-16 20:13:02 +08:00
Fangjun Kuang
3059eb4511
Fix doc URLs (#1660) 2024-06-21 11:10:14 +08:00
zr_jin
a813186f64
minor fix for docstr and default param. (#1490)
* Update train.py and README.md
2024-02-05 12:47:52 +08:00
zr_jin
9c494a3329
typos fixed (#1472) 2024-01-25 18:41:43 +08:00
Yifan Yang
559ed150bb
Fix typo (#1471) 2024-01-23 22:51:09 +08:00
zr_jin
ebe97a07b0
Reworked README.md (#1470)
* Rework README.md

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2024-01-23 16:26:24 +08:00
Shreyas0410
5cebecf2dc
updated broken link in README file (#1342) 2023-10-27 13:36:15 +08:00
Zengwei Yao
c0a53271e2
Update Zipformer-large result on LibriSpeech (#1343)
* update zipformer-large result on librispeech
2023-10-26 17:35:12 +08:00
Yifan Yang
416852e8a1
Add Zipformer recipe for GigaSpeech (#1254)
Co-authored-by: Yifan Yang <yifanyeung@qq.com>
Co-authored-by: yfy62 <yfy62@d3-hpc-sjtu-test-005.cm.cluster>
2023-10-21 15:36:59 +08:00
Zengwei Yao
9af144c26b
Zipformer update result (#1296)
* update Zipformer results
2023-10-09 23:15:22 +08:00
zr_jin
ce08230ade
Update README.md (#1293) 2023-10-07 11:57:30 +08:00
Ikko Eltociear Ashimine
0c564c6c81
Fix typo in README.md (#1257) 2023-09-17 12:25:37 +08:00
Zengwei Yao
f18b539fbc
Add the upgraded Zipformer model (#1058)
* add the zipformer codes, copied from branch from_dan_scaled_adam_exp1119

* support model export with torch.jit.script

* update RESULTS.md

* support exporting streaming model with torch.jit.script

* add results of streaming models, with some minor changes

* update README.md

* add CI test

* update k2 version in requirements-ci.txt

* update pyproject.toml
2023-05-19 16:47:59 +08:00
Yifan Yang
24b50a5bad
Update README.md (#1043)
* Update README.md
2023-05-08 16:59:05 +08:00
zr_jin
04671b44f8
Update README.md (#649) 2022-11-02 23:36:40 +08:00
Mingshuang Luo
1b478d3ac3
Add other decoding methods (nbest, nbest oracle, nbest LG) for wenetspeech pruned rnnt2 (#482)
* add other decoding methods for wenetspeech

* changes for RESULTS.md

* add ngram-lm-scale=0.35 results

* set ngram-lm-scale=0.35 as default

* Update README.md

* add nbest-scale for file name
2022-07-29 12:03:08 +08:00
Mingshuang Luo
f26b62ac00
[WIP] Pruned-transducer-stateless5-for-WenetSpeech (offline and streaming) (#447)
* pruned-rnnt5-for-wenetspeech

* style check

* style check

* add streaming conformer

* add streaming decode

* changes codes for fast_beam_search and export cpu jit

* add modified-beam-search for streaming decoding

* add modified-beam-search for streaming decoding

* change for streaming_beam_search.py

* add README.md and RESULTS.md

* change for style_check.yml

* do some changes

* do some changes for export.py

* add some decode commands for usage

* add streaming results on README.md
2022-07-28 12:54:27 +08:00
Fangjun Kuang
d99796898c
Update doc to add a link to Nadira Povey's YouTube channel. (#492)
* Update doc to add a link to Nadira Povey's YouTube channel.

* fix a typo
2022-07-25 12:06:40 +08:00
Mingshuang Luo
2cb1618c95
[Ready to merge] Pruned transducer stateless5 recipe for tal_csasr (mix Chinese chars and English BPE) (#428)
* add pruned transducer stateless5 recipe for tal_csasr

* do some changes for merging

* change for conformer.py

* add wer and cer for Chinese and English respectively

* fix an error for conformer.py
2022-06-28 11:02:10 +08:00
Mingshuang Luo
5c3ee8bfcd
[Ready to merge] Pruned transducer stateless5 recipe for AISHELL4 (#399)
* pruned-transducer-stateless5 recipe for aishell4

* pruned-transducer-stateless5 recipe for aishell4

* do some changes and text normalize

* do some changes

* add text normalize

* combine the training data and decode without webdataset

* update codes for merging

* Do a change for README.md
2022-06-14 22:19:05 +08:00
Fangjun Kuang
9f6c748b30
Add links to sherpa. (#417)
* Add links to sherpa.
2022-06-10 12:19:18 +08:00
Mingshuang Luo
beab229fd7
[Ready to merge] Pruned_transducer_stateless2 for alimeeting dataset (#378)
* add pruned-rnnt2 recipe for alimeeting dataset

* update code for merging

* change LilcomHdf5Writer to ChunkedLilcomHdf5Writer

* change for test.yml

* change for test.yml

* change for test.yml

* change for workflow yml

* change for yml

* change for yml

* change for README.md

* change for yml

* solve the conflicts

* solve the conflicts
2022-06-04 13:47:46 +08:00
Mingshuang Luo
0e57b30495
[Ready to merge] Pruned Transducer Stateless2 for WenetSpeech (char-based) (#349)
* add char-based pruned-rnnt2 for wenetspeech

* style check

* style check

* change for export.py

* do some changes

* do some changes

* a small change for .flake8

* solve the conflicts
2022-05-23 17:13:01 +08:00
Guanbo Wang
9630f9a3ba
Update GigaSpeech results (#364)
* Update decode.py

* Update export.py

* Update results

* Update README.md
2022-05-15 12:57:40 +08:00
Fangjun Kuang
f23dd43719
Update results for libri+giga multi dataset setup. (#363)
* Update results for libri+giga multi dataset setup.
2022-05-14 21:45:39 +08:00
Fangjun Kuang
e30e042c39
Update decoding script for gigaspeech and remove duplicate files. (#361) 2022-05-13 13:03:16 +08:00
Fangjun Kuang
6dc2e04462
Update results. (#340)
* Update results.

* Typo fixes.
2022-04-29 15:49:45 +08:00
Fangjun Kuang
ac84220de9
Modified conformer with multi datasets (#312)
* Copy files for editing.

* Use librispeech + gigaspeech with modified conformer.

* Support specifying number of workers for on-the-fly feature extraction.

* Feature extraction code for GigaSpeech.

* Combine XL splits lazily during training.

* Fix warnings in decoding.

* Add decoding code for GigaSpeech.

* Fix decoding the gigaspeech dataset.

We have to use the decoder/joiner networks for the GigaSpeech dataset.

* Disable speed perturb for XL subset.

* Compute the Nbest oracle WER for RNN-T decoding.

* Minor fixes.

* Minor fixes.

* Add results.

* Update results.

* Update CI.

* Update results.

* Fix style issues.

* Update results.

* Fix style issues.
2022-04-29 15:40:30 +08:00
Mingshuang Luo
118e195004
Update results for tedlium3 pruned RNN-T (#307)
* Update README.md
2022-04-11 22:19:26 +08:00
Mingshuang Luo
8cb727e24a
Tedlium3 pruned transducer stateless (#261)
* update tedlium3-pruned-transducer-stateless-codes

* update README.md

* update README.md

* add fast beam search for decoding

* do a change for RESULTS.md

* do a change for RESULTS.md

* do a fix

* do some changes for pruned RNN-T
2022-04-11 17:08:53 +08:00
Fangjun Kuang
bb7f6ed6b7
Add modified beam search for pruned rnn-t. (#248)
* Add modified beam search for pruned rnn-t.

* Fix style issues.

* Update RESULTS.md.

* Fix typos.

* Minor fixes.

* Test the pre-trained model using GitHub actions.

* Let the user install optimized_transducer on her own.

* Fix errors in GitHub CI.
2022-03-12 16:16:55 +08:00
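The "modified beam search" named in the entry above constrains every hypothesis to emit at most one non-blank symbol per frame, which keeps all hypotheses time-synchronous. A minimal toy sketch under that assumption; the function name, toy log-probs, and beam handling here are illustrative and not the icefall implementation:

```python
import math
from typing import Dict, List, Tuple

BLANK = 0  # token id 0 is treated as blank in this toy example


def modified_beam_search(
    log_probs: List[List[float]],  # [T][V] per-frame log-probs, a stand-in for joiner output
    beam: int = 4,
) -> Tuple[int, ...]:
    """Toy modified beam search: each hypothesis emits at most one
    non-blank symbol per frame, so all hypotheses advance in lockstep."""
    # Map token sequence -> accumulated log-probability.
    hyps: Dict[Tuple[int, ...], float] = {(): 0.0}
    for frame in log_probs:
        new_hyps: Dict[Tuple[int, ...], float] = {}
        for seq, score in hyps.items():
            for token, lp in enumerate(frame):
                # Blank keeps the sequence; a non-blank token extends it by one.
                new_seq = seq if token == BLANK else seq + (token,)
                cand = score + lp
                if new_seq in new_hyps:
                    # Merge identical sequences via log-sum-exp.
                    a, b = new_hyps[new_seq], cand
                    m = max(a, b)
                    new_hyps[new_seq] = m + math.log(math.exp(a - m) + math.exp(b - m))
                else:
                    new_hyps[new_seq] = cand
        # Prune to the `beam` best hypotheses before the next frame.
        hyps = dict(sorted(new_hyps.items(), key=lambda kv: kv[1], reverse=True)[:beam])
    return max(hyps.items(), key=lambda kv: kv[1])[0]


# Two frames over a 3-token vocab: frame 1 strongly favors token 1,
# frame 2 favors blank, so the decoded sequence is (1,).
lp = [
    [math.log(0.1), math.log(0.8), math.log(0.1)],
    [math.log(0.7), math.log(0.2), math.log(0.1)],
]
print(modified_beam_search(lp))  # (1,)
```

Because every hypothesis consumes exactly one frame per step, hypotheses of equal length can be batched together, which is the practical appeal of this variant over label-synchronous search.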
Fangjun Kuang
50d2281524
Add modified transducer loss for AIShell dataset (#219)
* Add modified transducer for aishell.

* Minor fixes.

* Add extra data in transducer training.

The extra data is from http://www.openslr.org/62/

* Update export.py and pretrained.py

* Update CI to install pretrained models with aishell.

* Update results.

* Update results.

* Update README.

* Use symlinks to avoid copies.
2022-03-02 16:02:38 +08:00
Fangjun Kuang
05cb297858
Update result for full libri + GigaSpeech using transducer_stateless. (#231) 2022-03-01 17:01:46 +08:00
Fangjun Kuang
72f838dee1
Update results for transducer_stateless after training for more epochs. (#207) 2022-03-01 16:35:02 +08:00
PF Luo
ac7c2d84bc
minor fix for aishell recipe (#223)
* just remove unnecessary torch.sum

* minor fixes for aishell
2022-02-23 08:33:20 +08:00
PF Luo
277cc3f9bf
update aishell-1 recipe with k2.rnnt_loss (#215)
* update aishell-1 recipe with k2.rnnt_loss

* fix flake8 style

* typo

* add pretrained model link to result.md
2022-02-19 15:56:39 +08:00
Fangjun Kuang
a8150021e0
Use modified transducer loss in training. (#179)
* Use modified transducer loss in training.

* Minor fix.

* Add modified beam search.

* Add modified beam search.

* Minor fixes.

* Fix typo.

* Update RESULTS.

* Fix a typo.

* Minor fixes.
2022-02-07 18:37:36 +08:00
Fangjun Kuang
f94ff19bfe
Refactor beam search and update results. (#177) 2022-01-18 16:40:19 +08:00
Fangjun Kuang
4c1b3665ee
Use optimized_transducer to compute transducer loss. (#162)
* WIP: Use optimized_transducer to compute transducer loss.

* Minor fixes.

* Fix decoding.

* Fix decoding.

* Add RESULTS.

* Update RESULTS.

* Update CI.

* Fix sampling rate for yesno recipe.
2022-01-10 11:54:58 +08:00
pingfengluo
ea8af0ee9a
add transducer_stateless with char unit to AIShell (#164) 2022-01-01 18:32:08 +08:00
Fangjun Kuang
14c93add50
Remove batchnorm, weight decay, and SOS from transducer conformer encoder (#155)
* Remove batchnorm, weight decay, and SOS.

* Make --context-size configurable.

* Update results.
2021-12-27 16:01:10 +08:00
Fangjun Kuang
5b6699a835
Minor fixes to the RNN-T Conformer model (#152)
* Disable weight decay.

* Remove input feature batchnorm.

* Replace BatchNorm in the Conformer model with LayerNorm.

* Use tanh in the joint network.

* Remove sos ID.

* Reduce the number of decoder layers from 4 to 2.

* Minor fixes.

* Fix typos.
2021-12-23 13:54:25 +08:00
Fangjun Kuang
fb6a57e9e0
Increase the size of the context in the RNN-T decoder. (#153) 2021-12-23 07:55:02 +08:00
Fangjun Kuang
1d44da845b
RNN-T Conformer training for LibriSpeech (#143)
* Begin to add RNN-T training for librispeech.

* Copy files from conformer_ctc.

Will edit it.

* Use conformer/transformer model as encoder.

* Begin to add training script.

* Add training code.

* Remove long utterances to avoid OOM when a large max_duration is used.

* Begin to add decoding script.

* Add decoding script.

* Minor fixes.

* Add beam search.

* Use LSTM layers for the encoder.

Need more tunings.

* Use stateless decoder.

* Minor fixes to make it ready for merge.

* Fix README.

* Update RESULTS.md to include RNN-T Conformer.

* Minor fixes.

* Fix tests.

* Minor fixes.

* Minor fixes.

* Fix tests.
2021-12-18 07:42:51 +08:00
Wei Kang
4151cca147
Add torch script support for Aishell and update documents (#124)
* Add aishell recipe

* Remove unnecessary code and update docs

* adapt to k2 v1.7, add docs and results

* Update conformer ctc model

* Update docs, pretrained.py & results

* Fix code style

* Fix code style

* Fix code style

* Minor fix

* Minor fix

* Fix pretrained.py

* Update pretrained model & corresponding docs

* Export torch script model for Aishell

* Add C++ deployment docs

* Minor fixes

* Fix unit test

* Update Readme
2021-11-19 16:37:05 +08:00
Mingshuang Luo
2e0f255ada
Add timit recipe (including the code scripts and the docs) for icefall (#114)
* add timit recipe for icefall

* add shared file

* update the docs for timit recipe

* Delete shared

* update the timit recipe and check style

* Update model.py

* Do some changes

* Update model.py

* Update model.py

* Add README.md and RESULTS.md

* Update RESULTS.md

* Update README.md

* update the docs for timit recipe
2021-11-17 11:23:45 +08:00
Fangjun Kuang
21096e99d8
Update result for the librispeech recipe using vocab size 500 and att rate 0.8 (#113)
* Update RESULTS using vocab size 500, att rate 0.8

* Update README.

* Refactoring.

Since FSAs in an Nbest object are linear in structure, we can
add the scores of a path to compute the total scores.

* Update documentation.

* Change default vocab size from 5000 to 500.
2021-11-10 14:32:52 +08:00
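The refactoring note in the entry above observes that because FSAs in an Nbest object are linear (single-path), a path's total score is just the sum of its arc scores, with no shortest-path search needed. A toy, self-contained sketch of that idea; the Arc type and helper below are hypothetical stand-ins, not k2's actual API:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Arc:
    # One arc of a linear (single-path) FSA: a label and its log-score.
    label: int
    score: float


def total_score(path: List[Arc]) -> float:
    """For a linear FSA there is exactly one path, so its total score
    is simply the sum of the arc scores along it."""
    return sum(arc.score for arc in path)


# A toy 3-arc path.
path = [Arc(5, -0.1), Arc(7, -0.3), Arc(2, -0.2)]
print(round(total_score(path), 6))  # -0.6
```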