How to use a pre-trained model to transcribe a sound file

You need to prepare 4 files:

  • a model checkpoint file, e.g., epoch-20.pt
  • HLG.pt, the decoding graph
  • words.txt, the word symbol table
  • a sound file, whose sampling rate has to be 16 kHz. Supported formats are those supported by torchaudio.load(), e.g., wav and flac. If your recording uses a different sampling rate, see the resampling sketch after this list.

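If your recording is not sampled at 16 kHz, you can resample it before decoding. The following is a minimal sketch using torchaudio; the file names input.wav and sound.wav are placeholders:

import torchaudio
import torchaudio.functional as F

# Load the recording; torchaudio.load() returns (waveform, sample_rate)
waveform, sample_rate = torchaudio.load("input.wav")

# Resample to 16 kHz if the recording uses a different rate
if sample_rate != 16000:
    waveform = F.resample(waveform, orig_freq=sample_rate, new_freq=16000)

# Save the 16 kHz version for use with pretrained.py
torchaudio.save("sound.wav", waveform, 16000)
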
Once you have the above files ready, you can run:

./conformer_ctc/pretrained.py \
  --checkpoint /path/to/your/checkpoint.pt \
  --words-file /path/to/words.txt \
  --hlg /path/to/HLG.pt \
  --sound-file /path/to/your/sound.wav

and you will see the transcribed result.
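
If you have several recordings to transcribe, you can invoke the script once per file. Below is a minimal sketch that wraps the command above in a Python loop via subprocess; the sound file names are placeholders:

import subprocess

# Placeholder file names; replace with your own recordings
sound_files = ["utt1.wav", "utt2.wav", "utt3.wav"]

for sound_file in sound_files:
    # Run pretrained.py with the same flags as above, once per recording
    subprocess.run(
        [
            "./conformer_ctc/pretrained.py",
            "--checkpoint", "/path/to/your/checkpoint.pt",
            "--words-file", "/path/to/words.txt",
            "--hlg", "/path/to/HLG.pt",
            "--sound-file", sound_file,
        ],
        check=True,
    )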