non stable training in some scenarios. The clamping range is set to (-10,2). Note that this change may cause unexpected effect if you resume training from a model that is trained without clamping.