marcoyang
25f465c8c0
Merge branch 'master' of github.com:marcoyang1998/icefall into mls_recipe
2024-02-28 13:37:14 +08:00
marcoyang
1f1f2cc449
minor fix
2024-02-28 12:11:16 +08:00
marcoyang
ef2b95cb29
data preparation for MLS
2024-02-28 12:10:37 +08:00
marcoyang
15e982aeba
add files
2024-02-28 11:57:31 +08:00
marcoyang
7258271414
support decoding
2024-02-28 11:56:38 +08:00
marcoyang
b6f3a2b186
compute fbank features for mls
2024-02-27 18:03:15 +08:00
marcoyang
6ae09f5836
add missing file
2024-02-27 18:02:51 +08:00
marcoyang
2b2da21208
update the training script
2024-02-27 18:02:33 +08:00
marcoyang
a9edd7cc3d
update the asr datamodule
2024-02-27 18:02:16 +08:00
marcoyang
ab76630e0d
change to bpe
2024-02-27 18:02:03 +08:00
marcoyang
594abc7975
add files
2024-02-27 16:05:58 +08:00
marcoyang
1efb1baff5
initial commit
2024-02-27 15:39:09 +08:00
marcoyang
8f2183c637
copy files from librispeech
2024-02-27 12:36:46 +08:00
Fangjun Kuang
291d06056c
Support torch 2.2.1 for cpu docker. ( #1516 )
2024-02-23 14:24:13 +08:00
Xiaoyu Yang
2483b8b4da
Zipformer recipe for SPGISpeech ( #1449 )
2024-02-22 15:53:19 +08:00
Wei Kang
819bb45539
Add pypinyin to requirements ( #1515 )
2024-02-22 15:50:11 +08:00
Wei Kang
aac7df064a
Recipes for open vocabulary keyword spotting ( #1428 )
...
* English recipe on gigaspeech; Chinese recipe on wenetspeech
2024-02-22 15:31:20 +08:00
Xiaoyu Yang
13daf73468
docs for finetune zipformer ( #1509 )
2024-02-21 18:06:27 +08:00
Wei Kang
c19b414778
Update docker (adding pypinyin) ( #1513 )
...
Update docker (adding pypinyin)
2024-02-21 08:04:16 +08:00
zr_jin
027302c902
minor fix for param. names ( #1495 )
2024-02-20 14:38:51 +08:00
Karel Vesely
e59fa38e86
docs: minor fixes of LM rescoring texts ( #1498 )
2024-02-20 10:40:15 +08:00
Zengwei Yao
b3e2044068
minor fix of vits/tokenizer.py ( #1504 )
...
* minor fix of vits/tokenizer.py
2024-02-19 19:33:32 +08:00
zr_jin
db4d66c0e3
Fixed softlink for ljspeech recipe ( #1503 )
2024-02-19 16:13:09 +08:00
Fangjun Kuang
7eb360d0d5
Fix cpu docker images for torch 2.2.0 ( #1502 )
2024-02-18 20:32:40 +08:00
Fangjun Kuang
17688476e5
Provide docker images for torch 2.2.0 ( #1501 )
2024-02-18 14:56:04 +08:00
Fangjun Kuang
06b356a610
Update cpu docker images to support torch 2.2.0 ( #1499 )
2024-02-18 12:05:38 +08:00
safarisadegh
d9ae8c02a0
Update README.md ( #1497 )
2024-02-09 15:05:01 +08:00
Wei Kang
711d6bc462
Refactor prepare.sh in librispeech ( #1493 )
...
* Refactor prepare.sh in librispeech and break it into three parts: prepare.sh (basic, minimal requirement for transducer), prepare_lm.sh (ngram & nnlm stuff), prepare_mmi.sh (for MMI training).
2024-02-09 10:44:19 +08:00
Tiance Wang
4ed88d9484
Update shared ( #1487 )
...
There should be one more ../
2024-02-07 10:16:02 +08:00
Xiaoyu Yang
777074046d
Fine-tune recipe for Zipformer ( #1484 )
...
1. support finetune zipformer
2. update the usage; set a very large batch count
2024-02-06 18:25:43 +08:00
zr_jin
a813186f64
minor fix for docstr and default param. ( #1490 )
...
* Update train.py and README.md
2024-02-05 12:47:52 +08:00
Teo Wen Shen
b9e6327adf
Fixing torch.ctc err ( #1485 )
...
* fixing torch.ctc err
* Move targets & lengths to CPU
2024-02-03 06:25:27 +08:00
Henry Li Xinyuan
b07d5472c5
Implement recipe for Fluent Speech Commands dataset ( #1469 )
...
---------
Signed-off-by: Xinyuan Li <xli257@c13.clsp.jhu.edu>
2024-01-31 22:53:36 +08:00
zr_jin
37b975cac9
fixed a CI test for wenetspeech ( #1476 )
...
* Comply to issue #1149
https://github.com/k2-fsa/icefall/issues/1149
2024-01-27 06:41:56 +08:00
Yuekai Zhang
1c30847947
Whisper Fine-tuning Recipe on Aishell1 ( #1466 )
...
* add decode seamlessm4t
* add requirements
* add decoding with avg model
* add token files
* add custom tokenizer
* support deepspeed to finetune large model
* support large-v3
* add model saving
* using monkey patch to replace models
* add manifest dir option
2024-01-27 00:32:30 +08:00
Fangjun Kuang
8d39f9508b
Fix torchscript export to use tokens.txt instead of lang_dir ( #1475 )
2024-01-26 19:18:33 +08:00
Zengwei Yao
c401a2646b
minor fix of zipformer/optim.py ( #1474 )
2024-01-26 15:50:11 +08:00
zr_jin
9c494a3329
typos fixed ( #1472 )
2024-01-25 18:41:43 +08:00
Yifan Yang
559ed150bb
Fix typo ( #1471 )
2024-01-23 22:51:09 +08:00
zr_jin
ebe97a07b0
Reworked README.md ( #1470 )
...
* Rework README.md
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2024-01-23 16:26:24 +08:00
Yifan Yang
5dfc3ed7f9
Fix buffer size of DynamicBucketingSampler ( #1468 )
...
* Fix buffer size
* Fix for flake8
---------
Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>
2024-01-21 02:10:42 +08:00
zr_jin
7bdde9174c
A Zipformer recipe with Byte-level BPE for Aishell-1 ( #1464 )
...
* init commit
* Update train.py
* Update decode.py
* Update RESULTS.md
* added `vocab_size`
* removed unused softlinks
* added scripts for testing pretrained models
* set `bpe_model` as required
* re-org the bbpe recipe for aishell
2024-01-16 21:08:35 +08:00
Fangjun Kuang
398401ed27
Update kaldifeat installation doc ( #1460 )
2024-01-14 14:38:41 +08:00
Xiaoyu Yang
e2fcb42f5f
fix typo ( #1455 )
2024-01-09 15:41:37 +08:00
zr_jin
5445ea6df6
Use shuffled LibriSpeech cuts instead ( #1450 )
...
* use shuffled LibriSpeech cuts instead
* leave the old code in comments for reference
2024-01-08 15:09:21 +08:00
zr_jin
b9b56eb879
Minor fixes to the VCTK data prep scripts ( #1441 )
...
* Update prepare.sh
2024-01-08 14:28:07 +08:00
Karel Vesely
716b82cc3a
streaming_decode.py, relax the audio range from [-1,+1] to [-10,+10] ( #1448 )
...
- some AudioTransform classes produce audio signals out of range [-1,+1]
- Resample produced 1.0079
- The range [-10,+10] was chosen to still be able to reliably distinguish from the [-32k,+32k] signal...
- this is related to: https://github.com/lhotse-speech/lhotse/issues/1254
2024-01-05 10:21:27 +08:00
Fangjun Kuang
8136ad775b
Use high_freq -400 in computing fbank features. ( #1447 )
...
See also https://github.com/k2-fsa/sherpa-onnx/issues/514
2024-01-04 13:59:32 +08:00
zr_jin
f42258caf8
Update compute_fbank_commonvoice_splits.py ( #1437 )
2023-12-30 13:03:26 +08:00
Fangjun Kuang
140e6381ad
Refactor CI tests for librispeech ( #1436 )
2023-12-27 13:21:14 +08:00