icefall

Author	SHA1	Message	Date
Fangjun Kuang	fba5e67d5e	Fix CI tests. (#1974 ) - Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle deprecations in PyTorch ≥2.3.0 - Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast with the new utilities across all training and inference scripts - Update all torch.load calls to include weights_only=False for compatibility with newer PyTorch versions	2025-07-01 13:47:55 +08:00
Fangjun Kuang	3059eb4511	Fix doc URLs (#1660 )	2024-06-21 11:10:14 +08:00
zr_jin	9d570870cf	Update asr_datamodule.py (#1619 )	2024-05-07 21:37:55 +08:00
zr_jin	242002e0bd	Strengthened style constraints (#1527 )	2024-03-04 23:28:04 +08:00
Yifan Yang	5dfc3ed7f9	Fix buffer size of DynamicBucketingSampler (#1468 ) * Fix buffer size * Fix for flake8 --------- Co-authored-by: yifanyeung <yifanyeung@yifanyeung.local>	2024-01-21 02:10:42 +08:00
Fangjun Kuang	8136ad775b	Use high_freq -400 in computing fbank features. (#1447 ) See also https://github.com/k2-fsa/sherpa-onnx/issues/514	2024-01-04 13:59:32 +08:00
Fangjun Kuang	f14b673408	Add HLG decoding with OpenFst on CPU for aishell conformer_ctc (#1279 )	2023-10-01 13:46:16 +08:00
Fangjun Kuang	2318c3fbd0	Support CTC decoding on CPU using OpenFst and kaldi decoders. (#1244 )	2023-09-26 16:36:19 +08:00
Fangjun Kuang	f5dc957d44	Fix CI tests (#1266 )	2023-09-21 21:16:14 +08:00
zr_jin	7cc2dae940	Fixes to incorporate with the latest Lhotse release (#1249 )	2023-09-13 12:39:49 +08:00
Fangjun Kuang	fc2df07841	Add icefall tutorials for dummies. (#1220 )	2023-08-16 22:32:41 +08:00
Piotr Żelasko	b0e8a40c89	Speed up yesno training to finish in ~10s on CPU (#1215 )	2023-08-13 09:50:59 +08:00
Fangjun Kuang	dfccadc6b6	Fix a typo in export_onnx.py for yesno (#1213 )	2023-08-12 16:59:06 +08:00
Fangjun Kuang	d6b28a11a7	Add export script for the yesno recipe. (#1212 )	2023-08-11 23:57:00 +08:00
Fangjun Kuang	6533f359c9	Fix CI (#726 ) * Fix CI * Disable shuffle for yesno. See https://github.com/k2-fsa/icefall/issues/197	2022-12-02 10:53:06 +08:00
Desh Raj	d31db01037	manual correction of black formatting	2022-11-17 14:18:05 -05:00
Desh Raj	107df3b115	apply black on all files	2022-11-17 09:42:17 -05:00
Fangjun Kuang	60317120ca	Revert "Apply new Black style changes"	2022-11-17 20:19:32 +08:00
Desh Raj	d110b04ad3	apply new black formatting to all files	2022-11-16 13:06:43 -05:00
Fangjun Kuang	d1f16a04bd	fix type hints for decode.py (#623 )	2022-10-18 06:56:12 +08:00
Wei Kang	5c17255eec	Sort results to make it more convenient to compare decoding results (#522 ) * Sort result to make it more convenient to compare decoding results * Add cut_id to recognition results * add cut_id to results for all recipes * Fix torch.jit.script * Fix comments * Minor fixes * Fix torch.jit.tracing for Pytorch version before v1.9.0	2022-08-12 07:12:50 +08:00
Fangjun Kuang	f1abce72f8	Use jsonl for CutSet in the LibriSpeech recipe. (#397 ) * Use jsonl for cutsets in the librispeech recipe. * Use lazy cutset for all recipes. * More fixes to use lazy CutSet. * Remove force=True from logging to support Python < 3.8 * Minor fixes. * Fix style issues.	2022-06-06 10:19:16 +08:00
Fangjun Kuang	1c35ae1dba	Reset seed at the beginning of each epoch. (#221 ) * Reset seed at the beginning of each epoch. * Use a different seed for each epoch.	2022-02-21 15:16:39 +08:00
Fangjun Kuang	4c1b3665ee	Use optimized_transducer to compute transducer loss. (#162 ) * WIP: Use optimized_transducer to compute transducer loss. * Minor fixes. * Fix decoding. * Fix decoding. * Add RESULTS. * Update RESULTS. * Update CI. * Fix sampling rate for yesno recipe.	2022-01-10 11:54:58 +08:00
Fangjun Kuang	95af039733	RNN-T training for yesno. (#141 ) * RNN-T training for yesno. * Rename Jointer to Joiner.	2021-12-07 21:44:37 +08:00
Wei Kang	4151cca147	Add torch script support for Aishell and update documents (#124 ) * Add aishell recipe * Remove unnecessary code and update docs * adapt to k2 v1.7, add docs and results * Update conformer ctc model * Update docs, pretrained.py & results * Fix code style * Fix code style * Fix code style * Minor fix * Minor fix * Fix pretrained.py * Update pretrained model & corresponding docs * Export torch script model for Aishell * Add C++ deployment docs * Minor fixes * Fix unit test * Update Readme	2021-11-19 16:37:05 +08:00
Fangjun Kuang	4890e27b45	Extract framewise alignment information using CTC decoding (#39 ) * Use new APIs with k2.RaggedTensor * Fix style issues. * Update the installation doc, saying it requires at least k2 v1.7 * Extract framewise alignment information using CTC decoding. * Print environment information. Print information about k2, lhotse, PyTorch, and icefall. * Fix CI. * Fix CI. * Compute framewise alignment information of the LibriSpeech dataset. * Update comments for the time to compute alignments of train-960. * Preserve cut id in mix cut transformer. * Minor fixes. * Add doc about how to extract framewise alignments.	2021-10-18 14:24:33 +08:00
Mingshuang Luo	391432b356	Update train.py ("10"--->"params.log_interval") (#76 ) * Update train.py * Update train.py * Update train.py	2021-10-12 21:30:31 +08:00
Mingshuang Luo	597c5efdb1	Use LossRecord to record and print the loss for the training process (#62 ) * Update index.rst (AS->ASR) * Update conformer_ctc.rst (pretraind->pretrained) * Fix some spelling errors. * Fix some spelling errors. * Use LossRecord to record and print loss in the training process * Change the name "LossRecord" to "MetricsTracker"	2021-10-12 15:58:03 +08:00
Piotr Żelasko	069ebaf9ba	Reformatting	2021-10-09 14:45:46 +00:00
Piotr Żelasko	b682467e4d	Use BucketingSampler for dev and test data	2021-10-08 22:32:13 -04:00
Fangjun Kuang	707d7017a7	Support pure ctc decoding requiring neither a lexicon nor an n-gram LM (#58 ) * Rename lattice_score_scale to nbest_scale. * Support pure CTC decoding requiring neither a lexicion nor an n-gram LM. * Fix style issues. * Fix a typo. * Minor fixes.	2021-09-26 14:21:49 +08:00
Fangjun Kuang	a80e58e15d	Refactor decode.py to make it more readable and more modular. (#44 ) * Refactor decode.py to make it more readable and more modular. * Fix an error. Nbest.fsa should always have token IDs as labels and word IDs as aux_labels. * Add nbest decoding. * Compute edit distance with k2. * Refactor nbest-oracle. * Add rescore with nbest lists. * Add whole-lattice rescoring. * Add rescoring with attention decoder. * Refactoring. * Fixes after refactoring. * Fix a typo. * Minor fixes. * Replace [] with () for shapes. * Use k2 v1.9 * Use Levenshtein graphs/alignment from k2 v1.9 * [doc] Require k2 >= v1.9 * Minor fixes.	2021-09-20 15:44:54 +08:00
Fangjun Kuang	abadc71415	Use new APIs with k2.RaggedTensor (#38 ) * Use new APIs with k2.RaggedTensor * Fix style issues. * Update the installation doc, saying it requires at least k2 v1.7 * Use k2 v1.7	2021-09-08 14:55:30 +08:00
Fangjun Kuang	184dbb3ea5	Add documentation about code style and creating new recipes. (#27 )	2021-08-25 14:48:41 +08:00
Fangjun Kuang	1bd5dcc8ac	WIP: Add doc for the LibriSpeech recipe. (#24 ) * WIP: Add doc for the LibriSpeech recipe. * Add more doc for LibriSpeech recipe. * Add more doc for the LibriSpeech recipe. * More doc.	2021-08-24 20:28:32 +08:00
Fangjun Kuang	01da00dca0	WIP: Add documentation. (#22 ) * Begin to add documentation. * WIP: Add documentation. * Fix a typo. * Add more doc for the recipe yesno. * Add more doc for the yesno recipe.	2021-08-24 14:28:08 +08:00
Fangjun Kuang	57cb611665	[yesno] Remove padding in TDNN (#21 ) * Disable SpecAug for yesno. Also replace Adam with SGD. * Remove padding in the model to make the results reproducible.	2021-08-23 15:59:36 +08:00
Fangjun Kuang	6c2c9b9d74	Add recipe for the yes_no dataset. (#16 ) * Add recipe for the yes_no dataset. * Refactoring: Remove unused code. * Add Colab notebook for the yesno dataset. * Add GitHub actions to run yesno. * Fix a typo. * Minor fixes. * Train more epochs for GitHub actions. * Minor fixes. * Minor fixes. * Fix style issues.	2021-08-23 11:36:29 +08:00

39 Commits