Fixes the computation of the maximum encoder_dim when encoder_dim is given as a comma-separated list of integers: the numeric max is now used instead of the lexicographic max over strings.

Fixes #2018
- Replace int(max(params.encoder_dim.split(","))), which takes a lexicographic max over strings, with max(_to_int_tuple(params.encoder_dim)), which takes a numeric max (see the sketch after this list).
- Apply the fix consistently across all affected training scripts.
- Introduce unified AMP helpers (create_grad_scaler, torch_autocast) to handle
  deprecations in PyTorch ≥ 2.3.0 (sketched below)
- Replace direct uses of torch.cuda.amp.GradScaler and torch.cuda.amp.autocast
with the new utilities across all training and inference scripts
- Update all torch.load calls to pass weights_only=False for compatibility with
  newer PyTorch versions (example below)
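
For reference, a minimal reproduction of the lexicographic-max bug (the dimension values are illustrative, and the _to_int_tuple shown here is an assumed stand-in for the helper already used by the scripts):

```python
from typing import Tuple


def _to_int_tuple(s: str) -> Tuple[int, ...]:
    # Assumed behaviour of the existing helper: "a,b,c" -> (a, b, c) as ints.
    return tuple(int(x) for x in s.split(","))


encoder_dim = "256,512,768,1024"  # illustrative value

# Old code: max() compares the substrings as strings, so "768" beats "1024"
# lexicographically and the wrong dimension is selected.
old = int(max(encoder_dim.split(",")))   # -> 768

# New code: convert to integers first, then take the numeric max.
new = max(_to_int_tuple(encoder_dim))    # -> 1024

assert old == 768 and new == 1024
```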
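A rough sketch of what the version-aware AMP helpers could look like; the actual implementations in this PR may differ, and the version cutoff and keyword passthrough are assumptions:

```python
import torch

# True on PyTorch >= 2.3, where the torch.cuda.amp.* entry points started
# being deprecated in favour of the device-agnostic torch.amp.* API.
_TORCH_GE_2_3 = tuple(
    int(x) for x in torch.__version__.split("+")[0].split(".")[:2]
) >= (2, 3)


def create_grad_scaler(device: str = "cuda", **kwargs):
    """Create a GradScaler via the non-deprecated API when available."""
    if _TORCH_GE_2_3:
        return torch.amp.GradScaler(device, **kwargs)
    return torch.cuda.amp.GradScaler(**kwargs)


def torch_autocast(device_type: str = "cuda", **kwargs):
    """Return an autocast context manager via the non-deprecated API."""
    if _TORCH_GE_2_3:
        return torch.amp.autocast(device_type=device_type, **kwargs)
    return torch.cuda.amp.autocast(**kwargs)
```

Call sites can then create scalers with create_grad_scaler(...) and wrap forward passes in with torch_autocast(...): without referencing torch.cuda.amp directly.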
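The torch.load change is needed because recent PyTorch releases switch the default of weights_only to True, which refuses to unpickle non-tensor Python objects stored in existing checkpoints. A minimal example, with a placeholder path and an assumed checkpoint layout:

```python
import torch

# Passing weights_only=False restores the previous torch.load behaviour so
# that checkpoints containing arbitrary pickled objects still load.
checkpoint = torch.load("exp/pretrained.pt", map_location="cpu", weights_only=False)
state_dict = checkpoint["model"]  # assumed key used by the training scripts
```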