1130 Commits

Author SHA1 Message Date
Kinan Martin
8beea2fbfb
Merge d74e2322e0d84f8d78f8594dbb6b8b4dc8b1b563 into 0904e490c5fb424dc5cb4d14ae468e4d32a07dc4 2025-11-28 11:47:38 +08:00
Fangjun Kuang
0904e490c5
Fix gigaspeech dataset iterator. (#2045)
Previously, the iterator was reset after every epoch, so with a small
--giga-prob training could end up always using only the first part of
the gigaspeech dataset.
2025-11-28 11:42:20 +08:00
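The iterator issue described in commit 0904e490c5 can be illustrated with a minimal sketch. It is not the recipe's code: the list `dataset`, the helper `draw`, and the value `per_epoch` are hypothetical stand-ins for a large dataset that is sampled only a few times per epoch.

```python
from itertools import islice


def draw(iterator, n):
    """Take up to n items from an iterator (hypothetical helper)."""
    return list(islice(iterator, n))


dataset = list(range(10))  # stand-in for a large dataset
per_epoch = 3              # few draws per epoch, as with a small sampling probability
num_epochs = 3

# Variant 1: a fresh iterator every epoch always restarts at the beginning.
reset_seen = []
for _ in range(num_epochs):
    it = iter(dataset)
    reset_seen.append(draw(it, per_epoch))

# Variant 2: one iterator kept across epochs keeps advancing through the data.
persistent_seen = []
it = iter(dataset)
for _ in range(num_epochs):
    persistent_seen.append(draw(it, per_epoch))

print(reset_seen)       # [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
print(persistent_seen)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```

In the first variant items 3 to 9 are never visited, which mirrors the "always use the first part" symptom from the commit message.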
Karel Vesely
693f069de7
zipformer/ctc_align.py (#2020)
* zipformer/ctc_align.py

- tool for forced-alignment with CTC model
- provides timeline, computes per-token and per-utterance acoustic confidences
- based on torchaudio `forced_align()`
- confidences are computed in several ways

other modifications:
- LibriSpeechAsrDataModel extended with `::load_manifest()` to allow
  passing-in cutset from CLI.
- update @custom_fwd @custom_bwd in scaling.py
- streaming_decode.py update errs/recogs/log filenames '-' <-> '_'

* putting back `custom_bwd`, `custom_fwd`

* integrating remarks from PR

* update of argparse help strings

* ctc_align.py, avoid shadowing a variable

* Finalizing the code:

- adding some coderabbit suggestions.
- removing `word_table`, `decoding_graph` from aligner API (unused)
- improved consistency of variable names (confidences)
- updated docstrings
2025-10-06 07:49:37 +08:00
Amir Hussein
729a5ba3ec
IWSLT-Ta ASR/ST (#1362)
This pull request adds ASR and ST recipes for the IWSLT 2022 dialectal (Tunisian) shared task: https://iwslt.org/2022/dialect
2025-09-22 09:58:00 +08:00
Amir Hussein
855536d355
HENT-SRT (#2026)
HENT-SRT: Hierarchical Efficient Neural Transducer with Self-Distillation for Joint Speech Recognition and Translation

Paper: https://arxiv.org/abs/2506.02157
2025-09-20 00:17:53 +08:00
Fangjun Kuang
63563d16d3
Fix setting joiner dim (#2027)
Fixes incorrect computation of encoder_dim when encoder_dim is a comma-separated list of integers by ensuring numeric (not lexicographic) max is used.

Fixes #2018

- Replace `int(max(params.encoder_dim.split(",")))` (lexicographic max on strings) with `max(_to_int_tuple(params.encoder_dim))` (numeric max).
- Apply the fix consistently across all affected training scripts.
2025-09-19 09:42:41 +08:00
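The difference between the lexicographic and numeric maximum fixed in commit 63563d16d3 can be reproduced in a few lines. This is a sketch only: `to_int_tuple` below is a hypothetical stand-in for the `_to_int_tuple` helper named in the commit message, and the value of `encoder_dim` is an assumed example, not taken from any recipe.

```python
def to_int_tuple(s):
    """Parse a comma-separated string of integers (hypothetical stand-in helper)."""
    return tuple(int(x) for x in s.split(","))


encoder_dim = "192,256,1024"  # assumed example value

# Old expression: max() compares the strings character by character,
# so "256" beats "1024" because "2" sorts after "1".
buggy = int(max(encoder_dim.split(",")))

# Fixed expression: convert to integers first, then take the maximum.
fixed = max(to_int_tuple(encoder_dim))

print(buggy)  # 256
print(fixed)  # 1024
```

The two expressions agree whenever all dimensions have the same number of digits, which may explain why the bug went unnoticed until a mixed-width configuration was used.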
Bailey Machiko Hirota
d74e2322e0
Merge branch 'master' into multi_ja_en_mls_english_clean 2025-09-10 23:35:30 -07:00
Bailey Machiko Hirota
8c846399a5
Update egs/mls_english/ASR/zipformer/streaming_decode.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-09-11 15:32:18 +09:00
Bailey Machiko Hirota
9d389cdca7
Update egs/reazonspeech/ASR/local/compute_fbank_musan.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-09-11 15:28:41 +09:00
Bailey Hirota
a30e80cb0f Remove accidentally added submodule musan-k2-v2-reazonspeech-medium 2025-09-11 15:18:33 +09:00
Kinan Martin
ecbe9851f0 Update streaming train and export commands 2025-09-04 10:57:11 +09:00
Kinan Martin
bc2560cb7a Update training commands and decode.py accuracy values, add streaming model section 2025-09-03 17:54:34 +09:00
Kinan Martin
ef7664e7cf
Update egs/mls_english/ASR/local/utils/asr_datamodule.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-09-02 18:07:06 +09:00
Kinan Martin
f64a706191
Update egs/multi_ja_en/ASR/RESULTS.md
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-09-02 18:05:58 +09:00
Bailey Machiko Hirota
9a940c3376
Update RESULTS.md 2025-09-02 11:48:58 +09:00
Bailey Machiko Hirota
2859c22995
Update RESULTS.md 2025-09-02 11:48:06 +09:00
Bailey Hirota
a4c1db5a49 reformat 2025-09-02 11:42:24 +09:00
Kinan Martin
7231cf44aa Remove changes to files outside of relevant recipes 2025-08-29 16:52:13 +09:00
Bailey Machiko Hirota
556a3f0941
Update README.md 2025-08-14 17:02:44 +09:00
Bailey Machiko Hirota
8e186160d1
Update RESULTS.md 2025-08-14 16:05:38 +09:00
Bailey Machiko Hirota
8c08c9c902
Create RESULTS.md 2025-08-14 11:53:16 +09:00
Bailey Hirota
5400f4315d training and decoding compatibility changes 2025-08-11 15:37:49 +09:00
Bailey Machiko Hirota
130c2a59c3
Merge branch 'multi_ja_en_mls_english_clean' into musan-mls-clean-final 2025-08-06 11:45:20 +09:00
Bailey Hirota
4e05d70f45 fix stash commit 2025-08-06 11:38:38 +09:00
Bailey Hirota
ee2a6d60e0 remove bilingual tag from train.py 2025-08-06 11:34:59 +09:00
Kinan Martin
0967f5f7d1 Manually fix merge conflict in multi_ja_en/ASR/zipformer/train.py 2025-08-06 11:32:30 +09:00
Bailey Hirota
636121c507 remove bilingual tag from train.py 2025-08-06 11:31:10 +09:00
Bailey Hirota
ed79fa3c04 revert unrelated transformer.py diffs from rebase 2025-08-05 21:44:26 +09:00
Bailey Hirota
c23af2ea1a musan implementation for mls_english 2025-08-05 19:18:34 +09:00
Fangjun Kuang
2d8e3fd858 Fix transformer decoder layer (#1995) 2025-08-05 19:16:10 +09:00
Bailey Machiko Hirota
11df2a83fc Musan implementation for ReazonSpeech (#1988) 2025-08-05 19:14:59 +09:00
Bailey Machiko Hirota
0ca7595d25 Update RESULTS.md 2025-08-05 19:13:48 +09:00
Bailey Hirota
8dd2c0f21b PR review suggestions implemented 2025-08-05 19:11:33 +09:00
Bailey Hirota
7b4abbaaac black and isort formatting 2025-08-05 19:10:23 +09:00
Bailey Machiko Hirota
2f1f419149 Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py
Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>
2025-08-05 19:09:12 +09:00
Bailey Machiko Hirota
b19929c302 Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py
Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>
2025-08-05 19:07:58 +09:00
Bailey Machiko Hirota
865b859e5d Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py
Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>
2025-08-05 19:06:45 +09:00
Bailey Machiko Hirota
95f58e69fd Update egs/multi_ja_en/ASR/local/utils/update_cutset_paths.py
Co-authored-by: Yubo <54519381+yuta0306@users.noreply.github.com>
2025-08-05 19:05:35 +09:00
Bailey Hirota
60f326bf63 working changes for musan mixing 2025-08-05 19:04:24 +09:00
Bailey Hirota
a310d8fd5b attempt to fix musan paths 2025-08-05 19:03:13 +09:00
Bailey Hirota
44758153df update musan symlinks 2025-08-05 19:02:04 +09:00
Bailey Hirota
aeffb15dab update musan paths 2025-08-05 19:00:55 +09:00
Bailey Hirota
6272827db3 update musan path 2025-08-05 18:59:44 +09:00
Bailey Hirota
c610c6d56a resolve typos and import issues 2025-08-05 18:58:34 +09:00
Bailey Hirota
1cf544b513 remove comment 2025-08-05 18:57:21 +09:00
Bailey Hirota
5fb4bdf9e7 commenting 2025-08-05 18:56:05 +09:00
Bailey Hirota
ed2c0a4597 typos 2025-08-05 18:54:52 +09:00
Bailey Hirota
199650781f changes to asr_datamodule for musan support 2025-08-05 18:53:37 +09:00
Kinan Martin
694ecb907a make prepare.sh symlinks relative 2025-08-05 18:51:16 +09:00
Bailey Hirota
9c91775a51 remove unused local scripts 2025-08-05 18:49:58 +09:00