icefall

mirror of https://github.com/k2-fsa/icefall.git synced 2025-12-11 06:55:27 +00:00

Author	SHA1	Message	Date
Guo Liyong	78418ac37c	fix comments	2022-04-13 13:09:24 +08:00
Mingshuang Luo	93c60a9d30	Code style check for librispeech pruned transducer stateless2 (#308 )	2022-04-11 22:15:18 +08:00
Daniel Povey	6eb6d9b4cd	Merge pull request #288 from danpovey/reworked_model Reworked model	2022-04-11 15:03:08 +08:00
Wei Kang	f721a2fd7a	Minor fixes for logging (#296 ) * Minor fixes for logging * Minor fix	2022-04-10 23:34:18 +08:00
Zengwei Yao	08473a17aa	Modify init (#301 ) * update icefall/__init__.py to import more common functions. * update icefall/__init__.py * make imports style consistent. * exclude black check for icefall/__init__.py in pyproject.toml.	2022-04-10 23:29:28 +08:00
Daniel Povey	d1e4ae788d	Refactor how learning rate is set.	2022-04-10 15:25:27 +08:00
Fangjun Kuang	7c0070e6f6	Display torch version in the training log. (#299 )	2022-04-08 11:39:54 +08:00
Zengwei Yao	ceeb95bcb8	update icefall/__init__.py to import more common functions. (#294 )	2022-04-06 11:55:29 +08:00
Fangjun Kuang	87cf9231ea	Support specifying iteration number of checkpoints for decoding. (#289 )	2022-04-03 13:02:08 +08:00
Zengwei Yao	0b6a2213c3	Modify icefall/__init__.py. (#287 ) * Modify icefall/__init__.py to import common functions defined in icefall/utils.py. * Modify icefall/__init__.py and .flake8.	2022-04-02 15:01:45 +08:00
LIyong.Guo	fc40bfea82	fix typo of torch.eig (#281 ) Co-authored-by: glynpu <glynwpu@qq.com>	2022-03-31 10:43:46 +08:00
Mingshuang Luo	f686635b54	Update diagnostics (#260 ) * update diagnostics.py	2022-03-30 14:52:55 +08:00
Fangjun Kuang	ae564f91e6	Periodically saving checkpoint after processing given number of batches (#259 ) * Periodically saving checkpoint after processing given number of batches.	2022-03-20 23:51:33 +08:00
Mingshuang Luo	518ec6414a	Update diagnostics.py (#254 ) * update diagnostics.py * do some changes	2022-03-16 20:17:45 +08:00
yaozengwei	ad62981765	Add diagnostics (#230 ) * Adding diagnostics code... * Move diagnostics code from local dir to the shared icefall dir * Remove the diagnostics code in the local dir * Update docs of arguments, and remove stats_types() function in TensorDiagnosticOptions object. * Update docs of arguments. * Add copyright information. * Corrected the time in copyright information. Co-authored-by: Daniel Povey <dpovey@gmail.com>	2022-03-04 15:38:23 +08:00
Fangjun Kuang	cbf8c18ebd	Minor fixes for aishell (#218 ) * Minor fixes to aishell. * Minor fixes.	2022-02-19 22:28:19 +08:00
Wei Kang	b702281e90	Use k2 pruned transducer loss to train conformer-transducer model (#194 ) * Using k2 pruned version transducer loss to train model * Fix style * Minor fixes	2022-02-17 13:33:54 +08:00
Wang, Guanbo	e8eb408760	Incremental pruning threshold (#214 ) * Incremental pruning threshold * flake8 * black * minor fix	2022-02-16 16:59:27 +08:00
Wang, Guanbo	be1c86b06c	print num_frame as %.2f (#204 )	2022-02-08 14:56:58 +08:00
Piotr Żelasko	f92c24a73a	Merge branch 'master' into feature/libri-conformer-phone-ctc	2022-01-24 10:18:56 -05:00
Piotr Żelasko	f0f35e6671	black	2022-01-21 17:22:41 -05:00
Piotr Żelasko	3d109b121d	Remove train_phones.py and modify train.py instead	2022-01-21 17:08:53 -05:00
huangruizhe	298faabb90	minor fixes	2022-01-02 23:38:33 -08:00
huangruizhe	7577b08bed	fixed the mistake	2022-01-02 23:32:43 -08:00
huangruizhe	82c8fac6ee	fixed a case where BOW can have problem to compute (ZeroDivisionError)	2022-01-02 15:29:50 -08:00
huangruizhe	0a67015d63	Update make_kn_lm.py	2022-01-02 00:27:27 -08:00
huangruizhe	49aab7e658	Update make_kn_lm.py Fixed issue #163	2022-01-02 00:14:27 -08:00
Fangjun Kuang	95af039733	RNN-T training for yesno. (#141 ) * RNN-T training for yesno. * Rename Jointer to Joiner.	2021-12-07 21:44:37 +08:00
Fangjun Kuang	ec591698b0	Associate a cut with token alignment (without repeats) (#125 ) * WIP: Associate a cut with token alignment (without repeats) * Save framewise alignments with/without repeats. * Minor fixes.	2021-11-29 18:50:54 +08:00
Fangjun Kuang	0e541f5b5d	Print hostname and IP address to the log. (#131 ) We are using multiple machines to do various experiments. It makes life easier to know which experiment is running on which machine if we also log the IP and hostname of the machine.	2021-11-26 11:25:59 +08:00
Piotr Żelasko	8eb94fa4a0	CTC-only phone conformer recipe for LibriSpeech	2021-11-23 15:34:46 -05:00
Wei Kang	4151cca147	Add torch script support for Aishell and update documents (#124 ) * Add aishell recipe * Remove unnecessary code and update docs * adapt to k2 v1.7, add docs and results * Update conformer ctc model * Update docs, pretrained.py & results * Fix code style * Fix code style * Fix code style * Minor fix * Minor fix * Fix pretrained.py * Update pretrained model & corresponding docs * Export torch script model for Aishell * Add C++ deployment docs * Minor fixes * Fix unit test * Update Readme	2021-11-19 16:37:05 +08:00
Wei Kang	30c43b7f69	Add aishell recipe (#30 ) * Add aishell recipe * Remove unnecessary code and update docs * adapt to k2 v1.7, add docs and results * Update conformer ctc model * Update docs, pretrained.py & results * Fix code style * Fix code style * Fix code style * Minor fix * Minor fix * Fix pretrained.py * Update pretrained model & corresponding docs	2021-11-18 10:00:47 +08:00
Fangjun Kuang	5b10310bd1	Handle empty lattices in attention decoder rescoring.	2021-11-11 15:42:30 +08:00
Fangjun Kuang	8d679c3e74	Fix typos. (#115 )	2021-11-10 14:45:30 +08:00
Fangjun Kuang	21096e99d8	Update result for the librispeech recipe using vocab size 500 and att rate 0.8 (#113 ) * Update RESULTS using vocab size 500, att rate 0.8 * Update README. * Refactoring. Since FSAs in an Nbest object are linear in structure, we can add the scores of a path to compute the total scores. * Update documentation. * Change default vocab size from 5000 to 500.	2021-11-10 14:32:52 +08:00
Fangjun Kuang	04029871b6	Fix a bug in Nbest.compute_am_scores and Nbest.compute_lm_scores. (#111 )	2021-11-09 13:44:51 +08:00
Fangjun Kuang	91cfecebf2	Remove duplicated token seq in rescoring. (#108 ) * Remove duplicated token seq in rescoring. * Use a larger range for ngram_lm_scale and attention_scale	2021-11-06 08:54:45 +08:00
Fangjun Kuang	12d647d899	Add a note about the CUDA OOM error. (#94 ) * Add a note about the CUDA OOM error. Some users consider this kind of OOM as an error during decoding, but actually it is not. This pull request clarifies that. * Fix style issues.	2021-10-29 12:17:56 +08:00
Fangjun Kuang	8cb7f712e4	Use GPU for averaging checkpoints if possible. (#84 )	2021-10-26 17:10:04 +08:00
Fangjun Kuang	53b79fafa7	Add MMI training with word pieces as modelling unit. (#6 ) * Fix an error in TDNN-LSTM training. * WIP: Refactoring * Refactor transformer.py * Remove unused code. * Minor fixes. * Fix decoder padding mask. * Add MMI training with word pieces. * Remove unused files. * Minor fixes. * Refactoring. * Minor fixes. * Use pre-computed alignments in LF-MMI training. * Minor fixes. * Update decoding script. * Add doc about how to check and use extracted alignments. * Fix style issues. * Fix typos. * Fix style issues. * Disable macOS tests for now.	2021-10-18 15:20:32 +08:00
Fangjun Kuang	4890e27b45	Extract framewise alignment information using CTC decoding (#39 ) * Use new APIs with k2.RaggedTensor * Fix style issues. * Update the installation doc, saying it requires at least k2 v1.7 * Extract framewise alignment information using CTC decoding. * Print environment information. Print information about k2, lhotse, PyTorch, and icefall. * Fix CI. * Fix CI. * Compute framewise alignment information of the LibriSpeech dataset. * Update comments for the time to compute alignments of train-960. * Preserve cut id in mix cut transformer. * Minor fixes. * Add doc about how to extract framewise alignments.	2021-10-18 14:24:33 +08:00
Mingshuang Luo	597c5efdb1	Use LossRecord to record and print the loss for the training process (#62 ) * Update index.rst (AS->ASR) * Update conformer_ctc.rst (pretraind->pretrained) * Fix some spelling errors. * Fix some spelling errors. * Use LossRecord to record and print loss in the training process * Change the name "LossRecord" to "MetricsTracker"	2021-10-12 15:58:03 +08:00
Fangjun Kuang	707d7017a7	Support pure ctc decoding requiring neither a lexicon nor an n-gram LM (#58 ) * Rename lattice_score_scale to nbest_scale. * Support pure CTC decoding requiring neither a lexicion nor an n-gram LM. * Fix style issues. * Fix a typo. * Minor fixes.	2021-09-26 14:21:49 +08:00
Fangjun Kuang	455693aede	Fix `hasattr` of AttributeDict. (#52 )	2021-09-22 16:37:20 +08:00
Fangjun Kuang	a80e58e15d	Refactor decode.py to make it more readable and more modular. (#44 ) * Refactor decode.py to make it more readable and more modular. * Fix an error. Nbest.fsa should always have token IDs as labels and word IDs as aux_labels. * Add nbest decoding. * Compute edit distance with k2. * Refactor nbest-oracle. * Add rescore with nbest lists. * Add whole-lattice rescoring. * Add rescoring with attention decoder. * Refactoring. * Fixes after refactoring. * Fix a typo. * Minor fixes. * Replace [] with () for shapes. * Use k2 v1.9 * Use Levenshtein graphs/alignment from k2 v1.9 * [doc] Require k2 >= v1.9 * Minor fixes.	2021-09-20 15:44:54 +08:00
Fangjun Kuang	cc77cb3459	Fix decode.py to remove the correct axis. (#50 ) * Fix decode.py to remove the correct axis. * Run GitHub actions manually.	2021-09-17 16:49:03 +08:00
Wei Kang	9a6e0489c8	update api for RaggedTensor (#45 ) * Fix code style * update k2 version in CI * fix compile hlg	2021-09-14 16:39:56 +08:00
Fangjun Kuang	abadc71415	Use new APIs with k2.RaggedTensor (#38 ) * Use new APIs with k2.RaggedTensor * Fix style issues. * Update the installation doc, saying it requires at least k2 v1.7 * Use k2 v1.7	2021-09-08 14:55:30 +08:00
Fangjun Kuang	1bd5dcc8ac	WIP: Add doc for the LibriSpeech recipe. (#24 ) * WIP: Add doc for the LibriSpeech recipe. * Add more doc for LibriSpeech recipe. * Add more doc for the LibriSpeech recipe. * More doc.	2021-08-24 20:28:32 +08:00

166 Commits