Han Zhu ab91112909
Improve infinity-check (#1862)
1. Attach the inf-check hooks if the grad scale is getting too small.
2. Add try-catch to avoid OOM in the inf-check hooks.
3. Set warmup_start=0.1 to reduce the chance of divergence.
2025-01-09 15:05:38 +08:00
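
As a rough illustration of item 3, the sketch below shows one way a `warmup_start` value could act on a learning-rate warmup. It assumes, without confirmation from the commit itself, that `warmup_start` is the initial multiplier on the learning rate and that it ramps linearly to 1.0; the function name `warmup_factor` and the parameter `warmup_batches` are hypothetical and used only for this example.

```python
def warmup_factor(batch: int, warmup_batches: int = 500, warmup_start: float = 0.1) -> float:
    """Return a learning-rate multiplier for the given batch index.

    Assumption: the multiplier starts at `warmup_start` and rises linearly
    to 1.0 over `warmup_batches` batches (hypothetical parameter), then
    stays at 1.0.
    """
    if warmup_batches <= 0 or batch >= warmup_batches:
        return 1.0
    progress = batch / warmup_batches
    return warmup_start + (1.0 - warmup_start) * progress


if __name__ == "__main__":
    # Print the multiplier at a few points during and after warmup.
    for step in (0, 125, 250, 500, 1000):
        print(step, warmup_factor(step))
```

Under this assumption, a smaller `warmup_start` means the first updates use a smaller effective learning rate, which is consistent with the stated goal of reducing the chance of divergence early in training.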