pkufool
507a450787
Update Readme
2021-11-18 15:58:05 +08:00
pkufool
5efe6373df
Fix unit test
2021-11-18 15:39:49 +08:00
pkufool
a1dd2921bb
Minor fixes
2021-11-18 15:11:51 +08:00
pkufool
0df945ad3d
Add C++ deployment docs
2021-11-18 15:08:43 +08:00
pkufool
83e6265f79
Export torch script model for Aishell
2021-11-18 12:52:26 +08:00
pkufool
8f91ed2fbe
Merge branch 'master' of github.com:pkufool/icefall into aishell
2021-11-18 10:58:03 +08:00
Wei Kang
30c43b7f69
Add aishell recipe (#30)
* Add aishell recipe
* Remove unnecessary code and update docs
* adapt to k2 v1.7, add docs and results
* Update conformer ctc model
* Update docs, pretrained.py & results
* Fix code style
* Fix code style
* Fix code style
* Minor fix
* Minor fix
* Fix pretrained.py
* Update pretrained model & corresponding docs
2021-11-18 10:00:47 +08:00
pkufool
856bb7f285
Update pretrained model & corresponding docs
2021-11-18 07:57:25 +08:00
pkufool
d57a8737b6
Fix pretrained.py
2021-11-17 19:53:25 +08:00
pkufool
a5cb6f7838
Minor fix
2021-11-17 19:46:48 +08:00
pkufool
b97ab74649
Minor fix
2021-11-17 19:44:02 +08:00
Fangjun Kuang
0660d12e4e
Fix computing WERs for empty hypotheses (#118)
* Fix computing WERs when empty lattices are generated.
* Minor fixes.
2021-11-17 19:25:47 +08:00
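The empty-hypothesis case fixed above is easy to get wrong: a decoder that produces no output for an utterance (e.g. from an empty lattice) should simply score as 100% WER for that utterance, not crash the scorer. A minimal word-error-rate sketch in plain Python illustrating the edge case (this is an illustration, not icefall's actual scoring code):

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between ref[:i-1] and hyp[:j] (rolling row).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, 1):
            cur[j] = min(prev[j] + 1,              # deletion
                         cur[j - 1] + 1,           # insertion
                         prev[j - 1] + (r != h))   # substitution
        prev = cur
    return prev[len(hyp)]


def wer(ref, hyp):
    # An empty hypothesis is not an error condition: it simply contributes
    # len(ref) deletions, i.e. 100% WER for that utterance.
    return edit_distance(ref, hyp) / max(len(ref), 1)


print(wer("hello world foo".split(), []))  # empty hypothesis -> 1.0
```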
Fangjun Kuang
336283f872
New label smoothing (#109)
* Modify label smoothing to match the one implemented in PyTorch.
* Enable CI for torch 1.10
* Fix CI errors.
* Fix CI installation errors.
* Fix CI installation errors.
* Minor fixes.
* Minor fixes.
* Minor fixes.
* Minor fixes.
* Minor fixes.
* Fix CI errors.
2021-11-17 19:24:07 +08:00
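PyTorch 1.10 added a `label_smoothing` argument to `torch.nn.CrossEntropyLoss`, in which the smoothing mass ε is spread uniformly over all classes, including the true one. A dependency-free sketch of the target distribution that formulation produces (illustrative; not icefall's own loss class):

```python
import math


def smoothed_targets(num_classes, true_class, epsilon=0.1):
    """Label-smoothed targets in the PyTorch >= 1.10 style:
    (1 - eps) * one_hot + eps / num_classes, with the eps mass spread
    over *all* classes, including the true one."""
    uniform = epsilon / num_classes
    target = [uniform] * num_classes
    target[true_class] += 1.0 - epsilon
    return target


def smoothed_loss(log_probs, true_class, epsilon=0.1):
    """Cross entropy between the smoothed targets and model log-probs."""
    target = smoothed_targets(len(log_probs), true_class, epsilon)
    return -sum(t * lp for t, lp in zip(target, log_probs))


# Example: 4 classes; the true class keeps 0.9 + 0.025, the rest get 0.025.
print(smoothed_targets(4, true_class=2, epsilon=0.1))
```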
pkufool
73ad3e3101
Fix code style
2021-11-17 19:16:39 +08:00
pkufool
f7a26400ab
Fix code style
2021-11-17 19:09:36 +08:00
pkufool
99b39bccce
Fix conflicts with origin master
2021-11-17 19:04:35 +08:00
pkufool
ebf142cb98
Fix code style
2021-11-17 19:02:44 +08:00
Mingshuang Luo
10e46f3e1d
A few changes for the timit recipe (#122)
* Update train.py
* Update train.py
* Update train.py
* Update tdnn_ligru_ctc.rst
2021-11-17 16:13:51 +08:00
Mingshuang Luo
2e0f255ada
Add timit recipe (including the code scripts and the docs) for icefall (#114)
* add timit recipe for icefall
* add shared file
* update the docs for timit recipe
* Delete shared
* update the timit recipe and check style
* Update model.py
* Do some changes
* Update model.py
* Update model.py
* Add README.md and RESULTS.md
* Update RESULTS.md
* Update README.md
* update the docs for timit recipe
2021-11-17 11:23:45 +08:00
Fangjun Kuang
68506609ad
Set fsa.properties to None after changing its labels in-place. (#121)
2021-11-16 23:11:30 +08:00
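The bug class behind this fix is stale cached metadata: when an object caches derived properties and its labels are then mutated in place, the cache must be invalidated or later code sees properties computed from the old labels. A toy Python stand-in for the pattern (not k2's actual `Fsa` class):

```python
class Fsa:
    """Toy stand-in showing why cached properties must be invalidated
    after in-place mutation (illustrative only, not k2's Fsa)."""

    def __init__(self, labels):
        self.labels = labels
        self._properties = None  # lazily computed, then cached

    @property
    def properties(self):
        if self._properties is None:
            # Pretend "are the labels sorted?" is an expensive derived property.
            self._properties = {"label_sorted": self.labels == sorted(self.labels)}
        return self._properties

    def set_labels_inplace(self, labels):
        self.labels[:] = labels
        self._properties = None  # the fix: drop the now-stale cache


fsa = Fsa([1, 2, 3])
print(fsa.properties["label_sorted"])  # True
fsa.set_labels_inplace([3, 1, 2])
print(fsa.properties["label_sorted"])  # False: recomputed after invalidation
```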
pkufool
cbc5557c87
Update docs, pretrained.py & results
2021-11-16 12:32:51 +08:00
pkufool
943244642f
Update conformer ctc model
2021-11-16 10:25:30 +08:00
pkufool
8666b49863
Merge branch 'master' of github.com:pkufool/icefall into aishell
2021-11-15 15:13:53 +08:00
Daniel Povey
b9452235d5
Merge pull request #117 from csukuangfj/fix-empty-lattice
Handle empty lattices in attention decoder rescoring.
2021-11-11 16:26:02 +08:00
Fangjun Kuang
5b10310bd1
Handle empty lattices in attention decoder rescoring.
2021-11-11 15:42:30 +08:00
Fangjun Kuang
8d679c3e74
Fix typos. (#115)
2021-11-10 14:45:30 +08:00
Fangjun Kuang
21096e99d8
Update result for the librispeech recipe using vocab size 500 and att rate 0.8 (#113)
* Update RESULTS using vocab size 500, att rate 0.8
* Update README.
* Refactoring.
Since FSAs in an Nbest object are linear in structure, we can add the scores of a path to compute the total scores.
* Update documentation.
* Change default vocab size from 5000 to 500.
2021-11-10 14:32:52 +08:00
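The refactoring note above rests on a simple fact: once the paths of an Nbest are extracted, each one is a linear FSA (one arc after another), so its total score is just the sum of its arc scores and no forward-algorithm pass is needed. A hedged sketch of that idea with plain lists (the real code operates on k2 ragged tensors):

```python
def path_total_scores(paths):
    """Each path is linear, so the total score is the plain sum of its
    per-arc scores.  A branching FSA would instead need a forward pass
    in the log semiring."""
    return [sum(arc_scores) for arc_scores in paths]


# Hypothetical per-arc log-scores for two extracted paths.
nbest_paths = [[-1.2, -0.3, -0.7], [-0.9, -0.8, -0.4]]
totals = path_total_scores(nbest_paths)
print([round(t, 6) for t in totals])  # [-2.2, -2.1]
```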
Fangjun Kuang
04029871b6
Fix a bug in Nbest.compute_am_scores and Nbest.compute_lm_scores. (#111)
2021-11-09 13:44:51 +08:00
Fangjun Kuang
91cfecebf2
Remove duplicated token seq in rescoring. (#108)
* Remove duplicated token seq in rescoring.
* Use a larger range for ngram_lm_scale and attention_scale
2021-11-06 08:54:45 +08:00
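Different lattice paths frequently collapse to the same token sequence, so rescoring each copy with the attention decoder is redundant work. A minimal deduplication sketch that keeps the best-scoring copy of each sequence (illustrative; the actual change works on k2 ragged tensors of token IDs):

```python
def unique_hyps(hyps):
    """Collapse hypotheses sharing the same token sequence, keeping the
    best (highest) score, so each sequence is rescored only once."""
    best = {}
    for tokens, score in hyps:
        key = tuple(tokens)
        if key not in best or score > best[key]:
            best[key] = score
    return [(list(k), s) for k, s in best.items()]


hyps = [([5, 9, 2], -3.1), ([5, 9, 2], -2.4), ([5, 7], -4.0)]
print(unique_hyps(hyps))  # [([5, 9, 2], -2.4), ([5, 7], -4.0)]
```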
pkufool
d9c8c73bf0
Merge branch 'master' of github.com:pkufool/icefall into aishell
2021-11-05 17:26:23 +08:00
Fangjun Kuang
810b193dcc
Clarify the doc about ctc-decoding. (#104)
2021-11-03 07:16:49 +08:00
Fangjun Kuang
42b437bea6
Use pre-sorted text to generate token ids for attention decoder. (#98)
* Use pre-sorted text to generate token ids for attention decoder.
See https://github.com/k2-fsa/icefall/issues/97 for more details.
* Fix typos.
2021-10-29 13:46:41 +08:00
Fangjun Kuang
12d647d899
Add a note about the CUDA OOM error. (#94)
* Add a note about the CUDA OOM error.
Some users consider this kind of OOM an error during decoding, but it is not; this pull request clarifies that.
* Fix style issues.
2021-10-29 12:17:56 +08:00
Fangjun Kuang
8cb7f712e4
Use GPU for averaging checkpoints if possible. (#84)
2021-10-26 17:10:04 +08:00
Fangjun Kuang
712ead8207
Fix an error when attention decoder rescoring returns None. (#90)
2021-10-22 19:52:25 +08:00
Piotr Żelasko
902e0b238d
Merge pull request #82 from pzelasko/feature/find-pessimistic-batches
Find CUDA OOM batches before starting training
2021-10-19 11:26:13 -04:00
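The idea of the PR above is cheap insurance: before training starts, scan the sampler for the most memory-hungry batches (largest total frame count, longest single utterance) and run a forward+backward pass on just those, so a CUDA OOM surfaces in seconds rather than hours into training. A torch-free sketch of the batch-selection step (names and criteria are illustrative, not lhotse's exact implementation):

```python
def find_pessimistic_batches(batches, num_keep=2):
    """Pick the batches most likely to trigger CUDA OOM: the one with
    the largest total frame count and the one containing the longest
    single utterance.  Trial-running these first fails fast on OOM."""
    by_total = max(batches, key=lambda b: sum(b))
    by_longest = max(batches, key=lambda b: max(b))
    # Deduplicate in case one batch wins both criteria.
    picked = [by_total]
    if by_longest is not by_total:
        picked.append(by_longest)
    return picked[:num_keep]


# Each batch is a list of per-utterance frame counts (illustrative data).
batches = [[900, 850, 800], [2400, 100], [700, 700, 700, 700]]
print(find_pessimistic_batches(batches))  # [[700, 700, 700, 700], [2400, 100]]
```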
Piotr Żelasko
3cc99d2af2
make flake8 happy
2021-10-19 11:24:54 -04:00
cdxie
d30244e28f
add a docker file for some users (#87)
* add a docker file for some users (Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8-python3.8)
* add a file describing how to use the dockerfile, with step-by-step instructions
2021-10-19 13:00:59 +08:00
Piotr Żelasko
86f3e0ef37
Make flake8 happy
2021-10-18 09:54:40 -04:00
Piotr Żelasko
6fbd7a287c
Refactor OOM batch scanning into a local function
2021-10-18 09:53:04 -04:00
Piotr Żelasko
d509d58f30
Merge branch 'master' into feature/find-pessimistic-batches
2021-10-18 09:47:21 -04:00
Fangjun Kuang
3effcb4225
Fix typos. (#85)
2021-10-18 16:17:14 +08:00
Fangjun Kuang
53b79fafa7
Add MMI training with word pieces as modelling unit. (#6)
* Fix an error in TDNN-LSTM training.
* WIP: Refactoring
* Refactor transformer.py
* Remove unused code.
* Minor fixes.
* Fix decoder padding mask.
* Add MMI training with word pieces.
* Remove unused files.
* Minor fixes.
* Refactoring.
* Minor fixes.
* Use pre-computed alignments in LF-MMI training.
* Minor fixes.
* Update decoding script.
* Add doc about how to check and use extracted alignments.
* Fix style issues.
* Fix typos.
* Fix style issues.
* Disable macOS tests for now.
2021-10-18 15:20:32 +08:00
Fangjun Kuang
4890e27b45
Extract framewise alignment information using CTC decoding (#39)
* Use new APIs with k2.RaggedTensor
* Fix style issues.
* Update the installation doc, saying it requires at least k2 v1.7
* Extract framewise alignment information using CTC decoding.
* Print environment information.
Print information about k2, lhotse, PyTorch, and icefall.
* Fix CI.
* Fix CI.
* Compute framewise alignment information of the LibriSpeech dataset.
* Update comments for the time to compute alignments of train-960.
* Preserve cut id in mix cut transformer.
* Minor fixes.
* Add doc about how to extract framewise alignments.
2021-10-18 14:24:33 +08:00
Jan "yenda" Trmal
bd7c2f7645
fix conformer typo in docs (#83)
2021-10-16 07:46:17 +08:00
Piotr Żelasko
403d1744ff
Introduce backprop in finding OOM batches
2021-10-15 10:05:13 -04:00
Piotr Żelasko
060117a9ff
Reformatting
2021-10-14 21:40:14 -04:00
Piotr Żelasko
1c7c79f2fc
Find CUDA OOM batches before starting training
2021-10-14 21:28:11 -04:00
Fangjun Kuang
fee1f84b20
Test pre-trained model in CI (#80)
* Add CI to run pre-trained models.
* Minor fixes.
* Install kaldifeat
* Install a CPU version of PyTorch.
* Fix CI errors.
* Disable decoder layers in pretrained.py if it is not used.
* Clone pre-trained model from GitHub.
* Minor fixes.
* Minor fixes.
* Minor fixes.
2021-10-15 00:41:33 +08:00
Mingshuang Luo
5401ce199d
Update ctc-decoding on pretrained.py and conformer_ctc.rst (#78)
2021-10-14 23:29:06 +08:00