Fangjun Kuang
516b4869b3
Add Matcha-TTS ( #1773 )
2024-10-29 15:04:04 +08:00
zr_jin
88bacfb9e6
minor fixes for the repo ( #1775 )
...
* minor fixes for the repo
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2024-10-21 13:51:56 +08:00
Zengwei Yao
693d84a301
Add Consistency-Regularized CTC ( #1766 )
...
* support consistency-regularized CTC
* update arguments of cr-ctc
* set default value of cr_loss_masked_scale to 1.0
* minor fix
* refactor codes
* update RESULTS.md
2024-10-21 10:35:26 +08:00
zzasdf
2653df5bda
fix the mismatch in batch_idx_train ( #1757 )
2024-10-12 19:14:28 +08:00
Fangjun Kuang
2e13298717
Refactor ctc greedy search. ( #1691 )
...
Use torch.unique_consecutive() to avoid reinventing the wheel.
2024-07-15 12:01:47 +08:00
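The commit body above says only that torch.unique_consecutive() replaces a hand-written collapse step. As a rough illustration of that idea, and not the code from the commit itself, CTC greedy search for one utterance can be sketched as below. The function name, the (T, V) input shape, and the default blank_id of 0 are assumptions made for this example.

```python
import torch


def ctc_greedy_search_sketch(log_probs: torch.Tensor, blank_id: int = 0) -> list:
    """Hypothetical helper: greedy CTC decoding for a single utterance.

    log_probs is assumed to have shape (T, V): one row of scores per frame.
    """
    # Pick the best-scoring token for every frame.
    best_per_frame = log_probs.argmax(dim=-1)  # shape (T,)
    # Collapse runs of identical consecutive tokens into one token.
    collapsed = torch.unique_consecutive(best_per_frame)
    # Drop the blank symbol; what remains is the hypothesis.
    return collapsed[collapsed != blank_id].tolist()
```

Collapsing repeats before removing blanks matters: a blank between two identical tokens keeps them as two separate output tokens.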
Zengwei Yao
d47c078286
add decoding method of ctc-greedy-search in zipformer recipe ( #1690 )
2024-07-14 17:30:13 +08:00
Zengwei Yao
f76afff741
Support CTC/AED option for Zipformer recipe ( #1389 )
...
* add attention-decoder loss option for zipformer recipe
* add attention-decoder-rescoring
* update export.py and pretrained_ctc.py
* update RESULTS.md
2024-07-05 20:19:18 +08:00
Fangjun Kuang
13f55d0735
Add merge_tokens for ctc forced alignment ( #1649 )
2024-06-12 17:45:13 +08:00
Daniel Povey
4d5c1f2e60
Remove inf from stored stats ( #1647 )
2024-06-10 22:41:54 +08:00
zr_jin
42a97f6d7b
Update env.py ( #1635 )
2024-05-22 22:29:38 +08:00
Yifan Yang
4e97b19b63
Remove duplicate logging initialization logic in utils.py ( #1617 )
2024-05-06 13:00:27 +08:00
Zengwei Yao
c08fe48603
add force=True to logging.basicConfig ( #1613 )
2024-05-04 11:42:23 +08:00
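For context on the force=True change: logging.basicConfig() does nothing when the root logger already has handlers, unless force=True is passed (Python 3.8+), in which case the existing root handlers are removed and closed first. A minimal sketch of that behaviour follows; the function name and the format string are assumptions for illustration and are not taken from the repository.

```python
import logging


def configure_logging_sketch(level: int = logging.INFO) -> None:
    """Hypothetical helper: (re)configure the root logger."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(message)s",  # illustrative format
        # Without force=True this call is silently ignored if some library
        # has already attached a handler to the root logger.
        force=True,
    )
```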
Dongji Gao
9a17f4ce41
add OTC related scripts using phone as units instead of BPEs ( #1602 )
...
* add otc related scripts using phone instead of bpe
2024-04-26 00:55:44 +08:00
Yifan Yang
368b7d10a7
clear log handlers before setup ( #1603 )
2024-04-24 15:31:25 +09:00
zr_jin
d5cd78a637
Update hooks.py ( #1564 )
2024-03-20 16:43:45 +08:00
zr_jin
9bd30853ae
Update diagnostics.py ( #1562 )
2024-03-20 15:35:14 +08:00
zr_jin
413220d6a4
Minor fixes for the multi_zh_en recipe ( #1526 )
2024-03-18 20:25:57 +08:00
zr_jin
eb132da00d
additional instruction for the grad_scale is too small error ( #1550 )
2024-03-14 11:33:49 +08:00
zr_jin
242002e0bd
Strengthened style constraints ( #1527 )
2024-03-04 23:28:04 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting ( #1428 )
...
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
zr_jin
027302c902
minor fix for param. names ( #1495 )
2024-02-20 14:38:51 +08:00
safarisadegh
d9ae8c02a0
Update README.md ( #1497 )
2024-02-09 15:05:01 +08:00
Henry Li Xinyuan
b07d5472c5
Implement recipe for Fluent Speech Commands dataset ( #1469 )
...
---------
Signed-off-by: Xinyuan Li <xli257@c13.clsp.jhu.edu>
2024-01-31 22:53:36 +08:00
Yuekai Zhang
1c30847947
Whisper Fine-tuning Recipe on Aishell1 ( #1466 )
...
* add decode seamlessm4t
* add requirements
* add decoding with avg model
* add token files
* add custom tokenizer
* support deepspeed to finetune large model
* support large-v3
* add model saving
* using monkey patch to replace models
* add manifest dir option
2024-01-27 00:32:30 +08:00
Wei Kang
11d816d174
Add customized score for hotwords ( #1385 )
...
* add custom score for each hotword
* Add more comments
* Fix decode
* fix style
* minor fixes
2023-11-18 18:47:55 +08:00
zr_jin
6d275ddf9f
fixed broken softlinks ( #1381 )
...
* removed broken softlinks
* fixed dependencies
* fixed file permission
2023-11-10 14:45:16 +08:00
lishaojie
1b2e99d374
add the pruned_transducer_stateless7_streaming recipe for commonvoice ( #1018 )
...
* add the pruned_transducer_stateless7_streaming recipe for commonvoice
* fix the symlinks
* Update RESULTS.md
2023-11-09 22:07:28 +08:00
zr_jin
23913f6afd
Minor refinements for some stale but recently merged PRs ( #1354 )
...
* incorporate https://github.com/k2-fsa/icefall/pull/1269
* incorporate https://github.com/k2-fsa/icefall/pull/1301
* black formatted
* incorporate https://github.com/k2-fsa/icefall/pull/1162
* black formatted
2023-10-31 10:28:20 +08:00
Rudra
eef47adee9
fix typo ( #1324 )
2023-10-19 22:54:43 +08:00
Daniel Povey
973dc1026d
Make diagnostics.py more error-tolerant and have wider range of supported torch versions ( #1234 )
2023-10-19 22:54:00 +08:00
Karel Vesely
543b4cc1ca
small enhancements ( #1322 )
...
- add extra check of 'x' and 'x_lens' to earlier point in Transducer model
- specify 'utf' encoding when opening text files for writing (recogs, errs)
2023-10-19 21:53:31 +08:00
Surav Shrestha
36c60b0cf6
fix typos in icefall/utils.py ( #1319 )
2023-10-19 11:15:18 +08:00
marcoyang1998
16a2748d6c
PromptASR for contextualized ASR with controllable style ( #1250 )
...
* Add PromptASR with BERT as text encoder
* Support using word-list based content prompts for context biasing
* Upload the pretrained models to huggingface
* Add usage example
2023-10-11 14:56:41 +08:00
Fangjun Kuang
f14b673408
Add HLG decoding with OpenFst on CPU for aishell conformer_ctc ( #1279 )
2023-10-01 13:46:16 +08:00
Dongji Gao
3abc290c11
Add scripts and recipe for BTC/OTC ( #1255 )
2023-09-29 07:52:46 +08:00
Fangjun Kuang
2318c3fbd0
Support CTC decoding on CPU using OpenFst and kaldi decoders. ( #1244 )
2023-09-26 16:36:19 +08:00
zr_jin
ef5da4824d
formatted the entire LibriSpeech recipe ( #1270 )
...
* formatted the entire librispeech recipe
* minor updates
2023-09-24 17:31:01 +08:00
zr_jin
3199058194
enable sclite_mode for swbd scoring ( #1239 )
2023-09-09 21:25:26 +08:00
Wei Kang
4d7f73ce65
Add context biasing for zipformer recipe ( #1204 )
...
* Add context biasing for zipformer recipe
* support context biasing in modified_beam_search_LODR
* fix context graph
* Minor fixes
2023-08-28 19:37:32 +08:00
zr_jin
a81396b482
Use tokens.txt to replace bpe.model ( #1162 )
2023-08-12 16:53:59 +08:00
Desh Raj
a4402b88e6
SURT multi-talker ASR recipe ( #1126 )
...
* merge upstream
* add SURT model and training
* add libricss decoding
* add chunk width randomization
* decode SURT with libricss
* initial commit for zipformer_ctc
* remove unwanted changes
* remove changes to other recipe
* fix zipformer softlink
* fix for JIT export
* add missing file
* fix symbolic links
* update results
* clean commit for SURT recipe
* training libricss surt model
* remove unwanted files
* remove unwanted changes
* remove changes in librispeech
* change some files to symlinks
* remove unwanted changes in utils
* add export script
* add README
* minor fix in README
* add assets for README
* replace some files with symlinks
* remove unused decoding methods
* fix symlink
* address comments from @csukuangfj
2023-07-04 19:25:58 +08:00
Nickolay V. Shmyrev
eca0202632
Add start-batch option for RNNLM training ( #1161 )
...
* Add start-batch option for RNNLM training
* Also set epoch
* Skip batches on load
2023-07-04 10:13:25 +08:00
Peter Ross
b4c38d7547
Use symlinks for best epochs ( #1123 )
...
* utils: add symlink_or_copyfile
* pruned_transducer_stateless7: use symlinks (when possible) to output best epochs
* Rename function
---------
Co-authored-by: Yifan Yang <64255737+yfyeung@users.noreply.github.com>
2023-06-12 13:51:46 +08:00
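The "use symlinks (when possible)" wording above suggests a try-symlink, fall-back-to-copy pattern. A minimal sketch of that pattern is shown below, under the assumption that a failed os.symlink() (for example on a filesystem or platform without symlink support) should lead to a plain file copy. The function name and signature are hypothetical and may differ from the helper added in the commit.

```python
import os
import shutil
from pathlib import Path


def link_or_copy_sketch(src: Path, dst: Path) -> None:
    """Hypothetical helper: make dst point at src, copying if linking fails."""
    # Remove a stale link or file left over from an earlier epoch.
    if dst.is_symlink() or dst.exists():
        dst.unlink()
    try:
        # Cheap path: a symlink avoids duplicating a large checkpoint.
        os.symlink(os.path.abspath(src), dst)
    except (OSError, NotImplementedError):
        # Fallback: symlinks unavailable or not permitted, so copy the file.
        shutil.copyfile(src, dst)
```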
Wei Kang
ba257efbcd
Add Context biasing ( #1038 )
...
* Add context biasing for librispeech
* Add context biasing for wenetspeech
* fix bugs
* Implement Aho-Corasick context graph
* fix some bugs
* Fixes to forward_one_step; add draw to context graph
* add output arc; fix black
* Fix wenetspeech tokenizer
* Minor fixes to the decode.py
2023-06-03 21:28:49 +08:00
Zengwei Yao
7a604057f9
update diagnostics, print limits in Balancer, merge changes from Dan's branch zlm59 ( #1109 )
2023-06-01 14:24:19 +08:00
Zengwei Yao
6826b076d4
add flops profiler, support for Zipformer encoder and Conformer encoder ( #1093 )
...
* add flops profiler, support for Zipformer encoder and Conformer encoder
* support for reworked conformer and old zipformer
* skip black check
2023-05-24 19:10:45 +08:00
Fangjun Kuang
dbcf0b41db
Fix stateless7 training error ( #1082 )
2023-05-23 12:52:02 +08:00
Zengwei Yao
a7e142b7ff
Support long audios recognition ( #980 )
...
* support long file transcription
* rename recipe as long_file_recog
* add docs
* support multi-gpu decoding
* style fix
2023-05-19 20:27:55 +08:00
Zengwei Yao
f18b539fbc
Add the upgraded Zipformer model ( #1058 )
...
* add the zipformer codes, copied from branch from_dan_scaled_adam_exp1119
* support model export with torch.jit.script
* update RESULTS.md
* support exporting streaming model with torch.jit.script
* add results of streaming models, with some minor changes
* update README.md
* add CI test
* update k2 version in requirements-ci.txt
* update pyproject.toml
2023-05-19 16:47:59 +08:00
Wei Kang
bccd20d978
Training with byte level BPE (TAL_CSASR) ( #1033 )
...
* Add byte level bpe tal_csasr recipe
* Minor fixes to decoding and exporting
* Fix prepare.sh
* Update results
2023-05-16 12:44:52 +08:00