* Add fast_beam_search_nbest.
* Fix CI errors.
* Fix CI errors.
* More fixes.
* Small fixes.
* Support using log_add in LG decoding with fast_beam_search.
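A minimal sketch of the max vs. log-add distinction using k2's `get_tot_scores` (the toy FSA below is purely illustrative, not from the recipe):

```python
import k2

# A toy acceptor with two parallel paths from state 0 to state 1.
s = """
0 1 1 0.1
0 1 2 0.3
1 2 -1 0.0
2
"""
fsa = k2.create_fsa_vec([k2.Fsa.from_str(s)])

# Max semiring: the score of the single best path.
print(fsa.get_tot_scores(use_double_scores=True, log_semiring=False))

# Log semiring (log_add): combines all paths with logsumexp,
# so alternative paths also contribute to the total score.
print(fsa.get_tot_scores(use_double_scores=True, log_semiring=True))
```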
* Support LG decoding in pruned_transducer_stateless.
* Support LG for pruned_transducer_stateless2.
* Support LG for fast beam search.
* Minor fixes.
* Initial commit.
* Support download, data preparation, and fbank computation.
* Use on-the-fly feature extraction by default.
* Support a BPE-based lang directory.
* Support HLG for BPE.
* Small fix.
* Small fix.
* Use chunked feature extraction by default.
* Compute features for GigaSpeech by splitting the manifest.
* Fixes after review.
* Split manifests into 2000 pieces.
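A sketch of the splitting step with lhotse (paths are illustrative):

```python
from lhotse import CutSet

cuts = CutSet.from_file("data/fbank/gigaspeech_cuts_XL.jsonl.gz")

# Eagerly split the manifest into 2000 pieces so feature extraction
# can process (and checkpoint) one piece at a time.
for i, piece in enumerate(cuts.split(num_splits=2000)):
    piece.to_file(f"data/fbank/XL_split/cuts_XL.{i:04d}.jsonl.gz")
```

A later entry ("Use split-lazy") switches to `CutSet.split_lazy`, which writes fixed-size chunks to disk without holding the whole manifest in memory.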
* Set audio duration mismatch tolerance to 0.01.
* Small fix.
* Add Conformer training recipe.
* Add conformer.py without pre-commit checking
* Use lazy loading and SingleCutSampler.
* Switch to DynamicBucketingSampler.
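A sketch of the sampler setup with lhotse (numbers and paths are illustrative): DynamicBucketingSampler streams cuts from a lazy manifest and batches cuts of similar duration together, which cuts down on padding.

```python
import torch
from lhotse import CutSet
from lhotse.dataset import DynamicBucketingSampler, K2SpeechRecognitionDataset

# Lazy manifest: cuts are streamed from disk, not loaded up front.
cuts = CutSet.from_jsonl_lazy("data/fbank/gigaspeech_cuts_XL.jsonl.gz")

sampler = DynamicBucketingSampler(
    cuts,
    max_duration=200.0,  # total seconds of audio per batch
    num_buckets=30,      # group cuts of similar duration
    shuffle=True,
)
dataloader = torch.utils.data.DataLoader(
    K2SpeechRecognitionDataset(),
    sampler=sampler,
    batch_size=None,  # the sampler already yields whole batches
)
```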
* Use KaldifeatFbank to compute fbank features for MUSAN.
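And a sketch of the MUSAN feature step (device and paths are assumptions; kaldifeat can run on GPU when one is available):

```python
from lhotse import CutSet, KaldifeatFbank, KaldifeatFbankConfig

cuts = CutSet.from_file("data/manifests/musan_cuts.jsonl.gz")

# kaldifeat-based fbank extraction; use device="cpu" without a GPU.
extractor = KaldifeatFbank(KaldifeatFbankConfig(device="cuda"))

cuts = cuts.compute_and_store_features_batch(
    extractor=extractor,
    storage_path="data/fbank/musan_feats",
    num_workers=4,
)
```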
* Use a pretrained language model and lexicon.
* Use a 3-gram LM to decode and a 4-gram LM to rescore.
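A sketch of the two-pass setup, using helper names from icefall's `icefall/decode.py` (exact signatures may differ between versions; `nnet_output`, `supervision_segments`, `HLG`, and `G_4gram` are assumed to be prepared as in the recipe):

```python
from icefall.decode import get_lattice, rescore_with_whole_lattice

# First pass: intersect the network output with HLG (built from the 3-gram LM).
lattice = get_lattice(
    nnet_output=nnet_output,                  # (N, T, C) log-probs
    decoding_graph=HLG,
    supervision_segments=supervision_segments,
    search_beam=20,
    output_beam=8,
    min_active_states=30,
    max_active_states=10000,
)

# Second pass: replace the 3-gram scores on the whole lattice with
# 4-gram scores. Returns a dict mapping each lm_scale to its best path.
best_paths = rescore_with_whole_lattice(
    lattice=lattice,
    G_with_epsilon_loops=G_4gram,
    lm_scale_list=[0.6, 0.8, 1.0],
)
```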
* Add decode.py
* Update .flake8
* Delete compute_fbank_gigaspeech.py
* Use BucketingSampler for the valid and test dataloaders
* Update params in train.py
* Use bpe_500
* Update params in decode.py
* Decrease num_paths when CUDA OOM occurs
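The backoff loop, sketched (the function and variable names here are illustrative, not from the recipe):

```python
import torch

def decode_with_backoff(decode_fn, num_paths: int = 200, min_paths: int = 25):
    """Call decode_fn(num_paths), halving num_paths on CUDA OOM."""
    while True:
        try:
            return decode_fn(num_paths)
        except RuntimeError as e:
            if "out of memory" not in str(e) or num_paths <= min_paths:
                raise
            torch.cuda.empty_cache()
            num_paths //= 2
            print(f"CUDA OOM; retrying with num_paths={num_paths}")
```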
* Added README
* Update RESULTS
* Apply black formatting
* Decrease num_paths when CUDA OOM occurs
* Decode with post-processing
* Update results
* Remove lazy_load option
* Use default `storage_type`
* Keep the original tolerance
* Use split-lazy
* Apply black formatting
* Update pretrained model
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
* Update RESULTS using vocab size 500, att rate 0.8
* Update README.
* Refactoring.
Since the FSAs in an Nbest object are linear in structure, we can
sum the arc scores along each path to compute its total score.
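A one-line sketch of that observation, assuming `nbest.fsa` is a k2 FsaVec of linear FSAs: with exactly one path per FSA, the max and log-add semirings coincide, and the total score is simply the sum of the arc scores along the path.

```python
# Each FSA is a single linear path, so this reduces to summing
# the arc scores along that path.
tot_scores = nbest.fsa.get_tot_scores(use_double_scores=True, log_semiring=False)
```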
* Update documentation.
* Change default vocab size from 5000 to 500.
* Add a note about the CUDA OOM error.
Some users consider this kind of OOM an error during decoding,
but it is not. This pull request clarifies that.
* Fix style issues.
* Rename lattice_score_scale to nbest_scale.
* Support pure CTC decoding, requiring neither a lexicon nor an n-gram LM.
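A sketch of lexicon-free CTC decoding with k2 (the beam values are illustrative, and `dense_fsa_vec` is assumed to be built from the model's log-probs): the decoding graph is just the CTC topology over BPE tokens.

```python
import k2

# CTC topology over a 500-token BPE vocabulary; no lexicon, no LM.
H = k2.ctc_topo(max_token=500, device="cuda")

lattice = k2.intersect_dense_pruned(
    H,
    dense_fsa_vec,  # k2.DenseFsaVec from the model's log-probs
    search_beam=20,
    output_beam=8,
    min_active_states=30,
    max_active_states=10000,
)
best_path = k2.shortest_path(lattice, use_double_scores=True)
```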
* Fix style issues.
* Fix a typo.
* Minor fixes.
* Refactor decode.py to make it more readable and more modular.
* Fix an error.
Nbest.fsa should always have token IDs as labels and
word IDs as aux_labels.
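A tiny sketch of the invariant (assuming `lattice` is a k2 lattice that ended up with the two label sets swapped): `k2.invert` exchanges labels and aux_labels, restoring token IDs on the labels and word IDs on the aux_labels.

```python
import k2

# Restore the invariant: labels = token IDs, aux_labels = word IDs.
lattice = k2.invert(lattice)
```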
* Add nbest decoding.
* Compute edit distance with k2.
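A minimal sketch of k2's Levenshtein utilities (available since k2 v1.9; the token sequences are illustrative):

```python
import torch
import k2

# Reference and hypothesis token sequences as Levenshtein graphs.
refs = k2.levenshtein_graph([[1, 2, 3], [4, 5]])
hyps = k2.levenshtein_graph([[1, 3], [4, 5, 6]])

# Align each hypothesis with its reference; the resulting FSA's labels
# and aux_labels show where the two sequences differ, and its scores
# encode the edit cost.
alignment = k2.levenshtein_alignment(
    refs=refs,
    hyps=hyps,
    hyp_to_ref_map=torch.tensor([0, 1], dtype=torch.int32),
    sorted_match_ref=True,
)
```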
* Refactor nbest-oracle.
* Add rescore with nbest lists.
* Add whole-lattice rescoring.
* Add rescoring with attention decoder.
* Refactoring.
* Fixes after refactoring.
* Fix a typo.
* Minor fixes.
* Replace [] with () for shapes.
* Use k2 v1.9
* Use Levenshtein graphs/alignment from k2 v1.9
* [doc] Require k2 >= v1.9
* Minor fixes.
* Support computing nbest oracle WER.
* Add scale to all nbest based decoding/rescoring methods.
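The scaling idea, sketched (assuming `lattice` is a k2 lattice and `nbest_scale` is the new parameter): scaling down the scores flattens the distribution, so `k2.random_paths` also samples lower-scoring, possibly better, paths.

```python
import k2

nbest_scale = 0.5  # illustrative value

# Scale down the scores before sampling, then restore them.
saved_scores = lattice.scores.clone()
lattice.scores *= nbest_scale
paths = k2.random_paths(lattice, use_double_scores=True, num_paths=100)
lattice.scores = saved_scores
```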
* Add script to run pretrained models.
* Use torchaudio to extract features.
* Support decoding multiple files at the same time.
Also, use kaldifeat for feature extraction.
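A sketch of the kaldifeat setup (the file names are illustrative):

```python
import kaldifeat
import torchaudio

opts = kaldifeat.FbankOptions()
opts.frame_opts.samp_freq = 16000
opts.mel_opts.num_bins = 80
fbank = kaldifeat.Fbank(opts)

# kaldifeat accepts a list of 1-D waveforms, so several files can be
# processed in one call.
waves = []
for f in ["a.wav", "b.wav"]:
    wave, sample_rate = torchaudio.load(f)
    waves.append(wave[0])

features = fbank(waves)  # a list of (num_frames, 80) tensors
```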
* Support decoding with LM rescoring and attention-decoder rescoring.
* Minor fixes.
* Replace scale with lattice-score-scale.
* Add usage example with a provided pretrained model.