icefall

Author	SHA1	Message	Date
Fangjun Kuang	fba5e67d5e	Fix CI tests. (#1974) - Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle deprecations in PyTorch ≥2.3.0 - Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast with the new utilities across all training and inference scripts - Update all torch.load calls to include weights_only=False for compatibility with newer PyTorch versions	2025-07-01 13:47:55 +08:00
Desh Raj	107df3b115	apply black on all files	2022-11-17 09:42:17 -05:00
Fangjun Kuang	60317120ca	Revert "Apply new Black style changes"	2022-11-17 20:19:32 +08:00
Desh Raj	d110b04ad3	apply new black formatting to all files	2022-11-16 13:06:43 -05:00
Fangjun Kuang	d1f16a04bd	fix type hints for decode.py (#623)	2022-10-18 06:56:12 +08:00
Wei Kang	5c17255eec	Sort results to make it more convenient to compare decoding results (#522) * Sort result to make it more convenient to compare decoding results * Add cut_id to recognition results * add cut_id to results for all recipes * Fix torch.jit.script * Fix comments * Minor fixes * Fix torch.jit.tracing for PyTorch version before v1.9.0	2022-08-12 07:12:50 +08:00
ezerhouni	0475d75d15	[Ready to be merged] Add RNN-LM to Conformer-CTC decoding (#439)	2022-06-23 19:37:03 +08:00
Fangjun Kuang	1d44da845b	RNN-T Conformer training for LibriSpeech (#143) * Begin to add RNN-T training for librispeech. * Copy files from conformer_ctc. Will edit it. * Use conformer/transformer model as encoder. * Begin to add training script. * Add training code. * Remove long utterances to avoid OOM when a large max_duration is used. * Begin to add decoding script. * Add decoding script. * Minor fixes. * Add beam search. * Use LSTM layers for the encoder. Need more tunings. * Use stateless decoder. * Minor fixes to make it ready for merge. * Fix README. * Update RESULT.md to include RNN-T Conformer. * Minor fixes. * Fix tests. * Minor fixes. * Minor fixes. * Fix tests.	2021-12-18 07:42:51 +08:00
Fangjun Kuang	ec591698b0	Associate a cut with token alignment (without repeats) (#125) * WIP: Associate a cut with token alignment (without repeats) * Save framewise alignments with/without repeats. * Minor fixes.	2021-11-29 18:50:54 +08:00
Fangjun Kuang	243fb9723c	Fix an error introduced while supporting torchscript. (#134) Should be `G.dummy = 1`, not `G["dummy"] = 1`.	2021-11-27 09:07:04 +08:00
Wei Kang	4151cca147	Add torch script support for Aishell and update documents (#124) * Add aishell recipe * Remove unnecessary code and update docs * adapt to k2 v1.7, add docs and results * Update conformer ctc model * Update docs, pretrained.py & results * Fix code style * Fix code style * Fix code style * Minor fix * Minor fix * Fix pretrained.py * Update pretrained model & corresponding docs * Export torch script model for Aishell * Add C++ deployment docs * Minor fixes * Fix unit test * Update Readme	2021-11-19 16:37:05 +08:00
Fangjun Kuang	0660d12e4e	Fix computing WERs for empty hypotheses (#118) * Fix computing WERs when empty lattices are generated. * Minor fixes.	2021-11-17 19:25:47 +08:00
Fangjun Kuang	68506609ad	Set fsa.properties to None after changing its labels in-place. (#121)	2021-11-16 23:11:30 +08:00
Fangjun Kuang	8d679c3e74	Fix typos. (#115)	2021-11-10 14:45:30 +08:00
Fangjun Kuang	21096e99d8	Update result for the librispeech recipe using vocab size 500 and att rate 0.8 (#113) * Update RESULTS using vocab size 500, att rate 0.8 * Update README. * Refactoring. Since FSAs in an Nbest object are linear in structure, we can add the scores of a path to compute the total scores. * Update documentation. * Change default vocab size from 5000 to 500.	2021-11-10 14:32:52 +08:00
Fangjun Kuang	8cb7f712e4	Use GPU for averaging checkpoints if possible. (#84)	2021-10-26 17:10:04 +08:00
Fangjun Kuang	712ead8207	Fix an error when attention decoder rescoring returns None. (#90)	2021-10-22 19:52:25 +08:00
Fangjun Kuang	4890e27b45	Extract framewise alignment information using CTC decoding (#39) * Use new APIs with k2.RaggedTensor * Fix style issues. * Update the installation doc, saying it requires at least k2 v1.7 * Extract framewise alignment information using CTC decoding. * Print environment information. Print information about k2, lhotse, PyTorch, and icefall. * Fix CI. * Fix CI. * Compute framewise alignment information of the LibriSpeech dataset. * Update comments for the time to compute alignments of train-960. * Preserve cut id in mix cut transformer. * Minor fixes. * Add doc about how to extract framewise alignments.	2021-10-18 14:24:33 +08:00
Fangjun Kuang	707d7017a7	Support pure ctc decoding requiring neither a lexicon nor an n-gram LM (#58) * Rename lattice_score_scale to nbest_scale. * Support pure CTC decoding requiring neither a lexicon nor an n-gram LM. * Fix style issues. * Fix a typo. * Minor fixes.	2021-09-26 14:21:49 +08:00
Fangjun Kuang	a80e58e15d	Refactor decode.py to make it more readable and more modular. (#44) * Refactor decode.py to make it more readable and more modular. * Fix an error. Nbest.fsa should always have token IDs as labels and word IDs as aux_labels. * Add nbest decoding. * Compute edit distance with k2. * Refactor nbest-oracle. * Add rescore with nbest lists. * Add whole-lattice rescoring. * Add rescoring with attention decoder. * Refactoring. * Fixes after refactoring. * Fix a typo. * Minor fixes. * Replace [] with () for shapes. * Use k2 v1.9 * Use Levenshtein graphs/alignment from k2 v1.9 * [doc] Require k2 >= v1.9 * Minor fixes.	2021-09-20 15:44:54 +08:00
Wei Kang	24656e9749	Update docs and remove unnecessary arguments (#42) * Fix typo in docs * Update docs and remove unnecessary arguments * Fix code style	2021-09-13 18:28:57 +08:00
Fangjun Kuang	f792b466bf	Change default value of lattice-score-scale from 1.0 to 0.5 (#41) * Change the default value of lattice-score-scale from 1.0 to 0.5 * Fix CI.	2021-09-13 10:49:18 +08:00
Fangjun Kuang	abadc71415	Use new APIs with k2.RaggedTensor (#38) * Use new APIs with k2.RaggedTensor * Fix style issues. * Update the installation doc, saying it requires at least k2 v1.7 * Use k2 v1.7	2021-09-08 14:55:30 +08:00
Fangjun Kuang	184dbb3ea5	Add documentation about code style and creating new recipes. (#27)	2021-08-25 14:48:41 +08:00
pkufool	f4223ee110	Add TDNN-LSTM-CTC Results (#25) * Add tdnn-lstm pretrained model and results * Add docs for TDNN-LSTM-CTC * Minor fix * Fix typo * Fix style checking	2021-08-24 21:09:27 +08:00
Fangjun Kuang	1bd5dcc8ac	WIP: Add doc for the LibriSpeech recipe. (#24) * WIP: Add doc for the LibriSpeech recipe. * Add more doc for LibriSpeech recipe. * Add more doc for the LibriSpeech recipe. * More doc.	2021-08-24 20:28:32 +08:00
pkufool	19c4214958	Fix code style and add copyright. (#18) * Fix style and add copyright * Minor fix * Remove duplicate lines * Reformat conformer.py by black * Reformat code style with black. * Fix github workflows * Fix lhotse installation * Install icefall requirements * Update k2 version, remove lhotse from test workflow	2021-08-23 10:43:59 +08:00
Fangjun Kuang	8469f9ae0a	Refactor asr_datamodule. (#15) * WIP: Refactor asr_datamodule. * Fixes after review. * Minor fixes.	2021-08-21 09:53:46 +08:00
Fangjun Kuang	9d0cc9d829	Support computing nbest oracle WER. (#10) * Support computing nbest oracle WER. * Add scale to all nbest based decoding/rescoring methods. * Add script to run pretrained models. * Use torchaudio to extract features. * Support decoding multiple files at the same time. Also, use kaldifeat for feature extraction. * Support decoding with LM rescoring and attention-decoder rescoring. * Minor fixes. * Replace scale with lattice-score-scale. * Add usage example with a provided pretrained model.	2021-08-20 11:53:37 +08:00
pkufool	ef233486ae	The training script produces a WER of 2.57% on librispeech test-clean (#13) * Add grad_clip and weight-decay, small fix of dataloader and masking * Add RESULTS.md	2021-08-20 10:08:08 +08:00
Fangjun Kuang	caa0b9e942	Fix an error in displaying decoding process. (#12)	2021-08-19 14:54:01 +08:00
Fangjun Kuang	5a0b9bcb23	Refactoring (#4) * Fix an error in TDNN-LSTM training. * WIP: Refactoring * Refactor transformer.py * Remove unused code. * Minor fixes.	2021-08-04 14:53:02 +08:00
Fangjun Kuang	398ed80d7a	Minor fixes to support DDP training.	2021-07-31 15:26:57 +08:00
Fangjun Kuang	bd69e4be32	Use attention decoder for rescoring.	2021-07-28 12:22:09 +08:00
Fangjun Kuang	f65854cca5	Add BPE decoding results.	2021-07-27 17:38:47 +08:00
Fangjun Kuang	4ccae509d3	WIP: Begin to add BPE decoding	2021-07-26 20:06:58 +08:00

36 Commits