Han Zhu ab91112909
Improve infinity-check (#1862)
1. Attach the inf-check hooks if the grad scale is getting too small.
2. Add try-catch to avoid OOM in the inf-check hooks.
3. Set warmup_start=0.1 to reduce the chance of divergence.
2025-01-09 15:05:38 +08:00
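To make item 3 concrete, here is a minimal sketch of how a `warmup_start` value might enter a learning-rate warmup. It assumes a linear ramp from `warmup_start` up to 1.0; the commit message only names `warmup_start=0.1`, so the function name `warmup_factor`, the `warmup_batches` parameter and its default, and the linear shape are all illustrative assumptions, not the repository's actual code.

```python
def warmup_factor(batch: int, warmup_batches: int = 500, warmup_start: float = 0.1) -> float:
    """Multiplier applied to the base learning rate during warmup.

    Hypothetical helper: assumes a linear ramp from `warmup_start` at
    batch 0 to 1.0 at `warmup_batches`, and 1.0 afterwards.
    """
    if batch >= warmup_batches:
        return 1.0
    return warmup_start + (1.0 - warmup_start) * (batch / warmup_batches)


if __name__ == "__main__":
    # Print the multiplier at the start, midpoint, and end of warmup.
    for step in (0, 250, 500):
        print(step, warmup_factor(step))
```

Under this assumption, a smaller `warmup_start` means the first updates use a smaller fraction of the base learning rate, which is one plausible way the change could reduce early divergence.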