190 Commits

Author SHA1 Message Date
Fangjun Kuang
516b4869b3
Add Matcha-TTS (#1773) 2024-10-29 15:04:04 +08:00
zr_jin
88bacfb9e6
minor fixes for the repo (#1775)
* minor fixes for the repo

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2024-10-21 13:51:56 +08:00
Zengwei Yao
693d84a301
Add Consistency-Regularized CTC (#1766)
* support consistency-regularized CTC

* update arguments of cr-ctc

* set default value of cr_loss_masked_scale to 1.0

* minor fix

* refactor codes

* update RESULTS.md
2024-10-21 10:35:26 +08:00
zzasdf
2653df5bda
fix the mismatch in batch_idx_train (#1757) 2024-10-12 19:14:28 +08:00
Fangjun Kuang
2e13298717
Refactor ctc greedy search. (#1691)
Use torch.unique_consecutive() to avoid reinventing the wheel.
2024-07-15 12:01:47 +08:00
Zengwei Yao
d47c078286
add decoding method of ctc-greedy-search in zipformer recipe (#1690) 2024-07-14 17:30:13 +08:00
Zengwei Yao
f76afff741
Support CTC/AED option for Zipformer recipe (#1389)
* add attention-decoder loss option for zipformer recipe

* add attention-decoder-rescoring

* update export.py and pretrained_ctc.py

* update RESULTS.md
2024-07-05 20:19:18 +08:00
Fangjun Kuang
13f55d0735
Add merge_tokens for ctc forced alignment (#1649) 2024-06-12 17:45:13 +08:00
Daniel Povey
4d5c1f2e60
Remove inf from stored stats (#1647) 2024-06-10 22:41:54 +08:00
zr_jin
42a97f6d7b
Update env.py (#1635) 2024-05-22 22:29:38 +08:00
Yifan Yang
4e97b19b63
Remove duplicate logging initialization logic in utils.py (#1617) 2024-05-06 13:00:27 +08:00
Zengwei Yao
c08fe48603
add force=True to logging.basicConfig (#1613) 2024-05-04 11:42:23 +08:00
Dongji Gao
9a17f4ce41
add OTC related scripts using phone as units instead of BPEs (#1602)
* add otc related scripts using phone instead of bpe
2024-04-26 00:55:44 +08:00
Yifan Yang
368b7d10a7
clear log handlers before setup (#1603) 2024-04-24 15:31:25 +09:00
zr_jin
d5cd78a637
Update hooks.py (#1564) 2024-03-20 16:43:45 +08:00
zr_jin
9bd30853ae
Update diagnostics.py (#1562) 2024-03-20 15:35:14 +08:00
zr_jin
413220d6a4
Minor fixes for the multi_zh_en recipe (#1526) 2024-03-18 20:25:57 +08:00
zr_jin
eb132da00d
additional instruction for the grad_scale is too small error (#1550) 2024-03-14 11:33:49 +08:00
zr_jin
242002e0bd
Strengthened style constraints (#1527) 2024-03-04 23:28:04 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting (#1428)
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
zr_jin
027302c902
minor fix for param. names (#1495) 2024-02-20 14:38:51 +08:00
safarisadegh
d9ae8c02a0
Update README.md (#1497) 2024-02-09 15:05:01 +08:00
Henry Li Xinyuan
b07d5472c5
Implement recipe for Fluent Speech Commands dataset (#1469)
---------

Signed-off-by: Xinyuan Li <xli257@c13.clsp.jhu.edu>
2024-01-31 22:53:36 +08:00
Yuekai Zhang
1c30847947
Whisper Fine-tuning Recipe on Aishell1 (#1466)
* add decode seamlessm4t

* add requirements

* add decoding with avg model

* add token files

* add custom tokenizer

* support deepspeed to finetune large model

* support large-v3

* add model saving

* using monkey patch to replace models

* add manifest dir option
2024-01-27 00:32:30 +08:00
Wei Kang
11d816d174
Add customized score for hotwords (#1385)
* add custom score for each hotword

* Add more comments

* Fix decode

* fix style

* minor fixes
2023-11-18 18:47:55 +08:00
zr_jin
6d275ddf9f
fixed broken softlinks (#1381)
* removed broken softlinks

* fixed dependencies

* fixed file permission
2023-11-10 14:45:16 +08:00
lishaojie
1b2e99d374
add the pruned_transducer_stateless7_streaming recipe for commonvoice (#1018)
* add the pruned_transducer_stateless7_streaming recipe for commonvoice

* fix the symlinks

* Update RESULTS.md
2023-11-09 22:07:28 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs (#1354)
* incorporate https://github.com/k2-fsa/icefall/pull/1269

* incorporate https://github.com/k2-fsa/icefall/pull/1301

* black formatted

* incorporate https://github.com/k2-fsa/icefall/pull/1162

* black formatted
2023-10-31 10:28:20 +08:00
Rudra
eef47adee9
fix typo (#1324) 2023-10-19 22:54:43 +08:00
Daniel Povey
973dc1026d
Make diagnostics.py more error-tolerant and have wider range of supported torch versions (#1234) 2023-10-19 22:54:00 +08:00
Karel Vesely
543b4cc1ca
small enhancements (#1322)
- add extra check of 'x' and 'x_lens' to earlier point in Transducer model
- specify 'utf' encoding when opening text files for writing (recogs, errs)
2023-10-19 21:53:31 +08:00
Surav Shrestha
36c60b0cf6
fix typos in icefall/utils.py (#1319) 2023-10-19 11:15:18 +08:00
marcoyang1998
16a2748d6c
PromptASR for contextualized ASR with controllable style (#1250)
* Add PromptASR with BERT as text encoder

* Support using word-list based content prompts for context biasing

* Upload the pretrained models to huggingface

* Add usage example
2023-10-11 14:56:41 +08:00
Fangjun Kuang
f14b673408
Add HLG decoding with OpenFst on CPU for aishell conformer_ctc (#1279) 2023-10-01 13:46:16 +08:00
Dongji Gao
3abc290c11
Add scripts and recipe for BTC/OTC (#1255) 2023-09-29 07:52:46 +08:00
Fangjun Kuang
2318c3fbd0
Support CTC decoding on CPU using OpenFst and kaldi decoders. (#1244) 2023-09-26 16:36:19 +08:00
zr_jin
ef5da4824d
formatted the entire LibriSpeech recipe (#1270)
* formatted the entire librispeech recipe

* minor updates
2023-09-24 17:31:01 +08:00
zr_jin
3199058194
enable sclite_mode for swbd scoring (#1239) 2023-09-09 21:25:26 +08:00
Wei Kang
4d7f73ce65
Add context biasing for zipformer recipe (#1204)
* Add context biasing for zipformer recipe

* support context biasing in modified_beam_search_LODR

* fix context graph

* Minor fixes
2023-08-28 19:37:32 +08:00
zr_jin
a81396b482
Use tokens.txt to replace bpe.model (#1162) 2023-08-12 16:53:59 +08:00
Desh Raj
a4402b88e6
SURT multi-talker ASR recipe (#1126)
* merge upstream

* add SURT model and training

* add libricss decoding

* add chunk width randomization

* decode SURT with libricss

* initial commit for zipformer_ctc

* remove unwanted changes

* remove changes to other recipe

* fix zipformer softlink

* fix for JIT export

* add missing file

* fix symbolic links

* update results

* clean commit for SURT recipe

* training libricss surt model

* remove unwanted files

* remove unwanted changes

* remove changes in librispeech

* change some files to symlinks

* remove unwanted changes in utils

* add export script

* add README

* minor fix in README

* add assets for README

* replace some files with symlinks

* remove unused decoding methods

* fix symlink

* address comments from @csukuangfj
2023-07-04 19:25:58 +08:00
Nickolay V. Shmyrev
eca0202632
Add start-batch option for RNNLM training (#1161)
* Add start-batch option for RNNLM training

* Also set epoch

* Skip batches on load
2023-07-04 10:13:25 +08:00
Peter Ross
b4c38d7547
Use symlinks for best epochs (#1123)
* utils: add symlink_or_copyfile

* pruned_transducer_stateless7: use symlinks (when possible) to output best epochs

* Rename function

---------

Co-authored-by: Yifan Yang <64255737+yfyeung@users.noreply.github.com>
2023-06-12 13:51:46 +08:00
Wei Kang
ba257efbcd
Add Context biasing (#1038)
* Add context biasing for librispeech

* Add context biasing for wenetspeech

* fix bugs

* Implement Aho-Corasick context graph

* fix some bugs

* Fixes to forward_one_step; add draw to context graph

* add output arc; fix black

* Fix wenetspeech tokenizer

* Minor fixes to the decode.py
2023-06-03 21:28:49 +08:00
Zengwei Yao
7a604057f9
update diagnostics, print limits in Balancer, merge changes from Dan's branch zlm59 (#1109) 2023-06-01 14:24:19 +08:00
Zengwei Yao
6826b076d4
add flops profiler, support for Zipformer encoder and Conformer encoder (#1093)
* add flops profiler, support for Zipformer encoder and Conformer encoder

* support for reworked conformer and old zipformer

* skip black check
2023-05-24 19:10:45 +08:00
Fangjun Kuang
dbcf0b41db
Fix stateless7 training error (#1082) 2023-05-23 12:52:02 +08:00
Zengwei Yao
a7e142b7ff
Support long audio recognition (#980)
* support long file transcription

* rename recipe as long_file_recog

* add docs

* support multi-gpu decoding

* style fix
2023-05-19 20:27:55 +08:00
Zengwei Yao
f18b539fbc
Add the upgraded Zipformer model (#1058)
* add the zipformer codes, copied from branch from_dan_scaled_adam_exp1119

* support model export with torch.jit.script

* update RESULTS.md

* support exporting streaming model with torch.jit.script

* add results of streaming models, with some minor changes

* update README.md

* add CI test

* update k2 version in requirements-ci.txt

* update pyproject.toml
2023-05-19 16:47:59 +08:00
Wei Kang
bccd20d978
Training with byte level BPE (TAL_CSASR) (#1033)
* Add byte level bpe tal_csasr recipe

* Minor fixes to decoding and exporting

* Fix prepare.sh

* Update results
2023-05-16 12:44:52 +08:00