- torch.cuda.amp.GradScaler(...) => torch.amp.GradScaler("cuda", ...) - torch.cuda.amp.autocast(...) => torch.amp.autocast("cuda", ...)
grad_scale is too small
* `pruned_transudcer_stateless7_streaming` for AISHELL-1 * Update train.py * Update train2.py * Update decode.py * Update RESULTS.md