Karel Vesely 693f069de7
zipformer/ctc_align.py (#2020)
* zipformer/ctc_align.py

- tool for forced-alignment with CTC model
- provides timeline, computes per-token and per-utterance acoustic confidences
- based on torchaudio `forced_align()`
- confidences are computed in several ways
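
A minimal sketch of how a frame-level CTC alignment can be turned into a per-token timeline with confidences. It assumes the alignment and the per-frame log-probabilities of the aligned labels are already available, for example from torchaudio `forced_align()`. The helper name `token_spans`, the output keys, and the two confidence definitions (mean and minimum frame posterior) are illustrative assumptions, not necessarily what `ctc_align.py` implements.

```python
import math
from itertools import groupby


def token_spans(alignment, frame_log_probs, blank=0):
    """Group a frame-level CTC alignment into per-token spans.

    alignment:        one label id per frame (blank frames included)
    frame_log_probs:  log-probability of the aligned label at each frame
    Returns a list of dicts with the token id, the frame span
    [start, end) and two illustrative confidence values.
    """
    spans = []
    t = 0
    # Consecutive identical labels form one CTC emission; repeated target
    # tokens are always separated by a blank frame in a valid alignment.
    for token, group in groupby(alignment):
        n = len(list(group))
        if token != blank:
            probs = [math.exp(lp) for lp in frame_log_probs[t : t + n]]
            spans.append(
                {
                    "token": token,
                    "start": t,
                    "end": t + n,  # exclusive
                    "mean_prob": sum(probs) / n,  # assumed definition
                    "min_prob": min(probs),  # assumed definition
                }
            )
        t += n
    return spans


def utterance_confidence(spans):
    """One possible per-utterance score: mean of per-token mean posteriors."""
    if not spans:
        return 0.0
    return sum(s["mean_prob"] for s in spans) / len(spans)


if __name__ == "__main__":
    # Toy input: 6 frames, tokens 3 and 5 separated by blanks (id 0).
    ali = [0, 3, 3, 0, 5, 0]
    lps = [math.log(p) for p in [0.9, 0.8, 0.6, 0.9, 0.5, 0.9]]
    for span in token_spans(ali, lps):
        print(span)
    print(utterance_confidence(token_spans(ali, lps)))
```

Frame indices can be converted to seconds by multiplying with the model's output frame shift, which depends on the subsampling of the encoder and is not fixed here.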

other modifications:
- LibriSpeechAsrDataModel extended with `::load_manifest()` to allow
  passing in a cutset from the CLI.
- update of `@custom_fwd`, `@custom_bwd` in scaling.py
- streaming_decode.py: update of errs/recogs/log filenames ('-' <-> '_')

* putting back `custom_bwd`, `custom_fwd`

* integrating remarks from PR

* update of argparse help strings

* ctc_align.py, avoid shadowing a variable

* Finalizing the code:

- adding some coderabbit suggestions.
- removing `word_table`, `decoding_graph` from aligner API (unused)
- improved consistency of variable names (confidences)
- updated docstrings
2025-10-06 07:49:37 +08:00