- tool for forced alignment with a CTC model
- provides a timeline and computes per-token and per-utterance acoustic confidences
- based on torchaudio `forced_align()`
- confidences are computed in several ways
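
For illustration only, here is a minimal pure-Python sketch of how per-token and per-utterance confidences could be derived from a frame-level alignment. It assumes inputs shaped like the two outputs of torchaudio `forced_align()` (one label and one log-probability per frame). The function names `token_confidences` and `utterance_confidence`, and the choice of averaging, are hypothetical and not necessarily what the tool does.

```python
import math
from itertools import groupby


def token_confidences(frame_labels, frame_log_probs, blank=0):
    """Merge runs of identical non-blank labels into token spans.

    Assumption: a token's confidence is the mean per-frame posterior
    over its span; the actual tool may compute it differently.
    """
    spans = []
    t = 0  # index of the first frame of the current run
    for label, run in groupby(frame_labels):
        n = len(list(run))
        if label != blank:
            probs = [math.exp(lp) for lp in frame_log_probs[t : t + n]]
            spans.append(
                {
                    "token": label,
                    "start_frame": t,
                    "end_frame": t + n,  # exclusive
                    "confidence": sum(probs) / len(probs),
                }
            )
        t += n
    return spans


def utterance_confidence(spans):
    """Assumption: utterance confidence is the mean token confidence."""
    if not spans:
        return 0.0
    return sum(s["confidence"] for s in spans) / len(spans)


# Toy alignment: 5 frames, blank id 0, tokens 5 and 7.
labels = [0, 5, 5, 0, 7]
log_probs = [math.log(p) for p in (0.9, 0.8, 0.6, 0.9, 0.5)]
spans = token_confidences(labels, log_probs)
utt_conf = utterance_confidence(spans)
```

Frame indices can be converted to seconds by multiplying by the model's frame shift, which gives the timeline.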
other modifications:
- LibriSpeechAsrDataModel extended with `::load_manifest()` to allow
passing in a cutset from the CLI.
- update the `@custom_fwd` and `@custom_bwd` decorators in scaling.py
- streaming_decode.py: update errs/recogs/log filenames ('-' <-> '_')
- Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle
deprecations in PyTorch ≥2.3.0
- Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast
with the new utilities across all training and inference scripts
- Update all torch.load calls to include weights_only=False for compatibility with
newer PyTorch versions
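
A rough sketch of what such AMP helpers could look like is given below. Only the names `create_grad_scaler` and `torch_autocast` come from this change; the bodies are an assumption. In particular, this sketch uses feature detection (`hasattr`) rather than an explicit version comparison, and it may differ from the actual implementation.

```python
import torch


def create_grad_scaler(device="cuda", **kwargs):
    """Return a GradScaler without using the deprecated constructor.

    Assumption: newer PyTorch exposes torch.amp.GradScaler(device, ...);
    older releases only provide torch.cuda.amp.GradScaler(...).
    """
    if hasattr(torch, "amp") and hasattr(torch.amp, "GradScaler"):
        return torch.amp.GradScaler(device, **kwargs)
    return torch.cuda.amp.GradScaler(**kwargs)


def torch_autocast(device_type="cuda", **kwargs):
    """Return an autocast context manager for the given device type."""
    return torch.autocast(device_type=device_type, **kwargs)


# Usage on CPU with AMP disabled, so the sketch runs without a GPU.
scaler = create_grad_scaler(enabled=False)
with torch_autocast(device_type="cpu", enabled=False):
    y = torch.ones(2) + 1
```

Call sites that previously used `torch.cuda.amp.GradScaler(...)` or `torch.cuda.amp.autocast(...)` would then call these helpers instead, keeping the version handling in one place.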
* fixes for `diagnostics`
Replace `2 ** 22` with `512` as the default value of `diagnostics.TensorDiagnosticOptions`
Also formatted some scripts with black.
* fixed formatting issues
* add the zipformer code, copied from branch from_dan_scaled_adam_exp1119
* support model export with torch.jit.script
* update RESULTS.md
* support exporting streaming model with torch.jit.script
* add results of streaming models, with some minor changes
* update README.md
* add CI test
* update k2 version in requirements-ci.txt
* update pyproject.toml