AMP | 安橙's Blog

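A breakdown of mixed-precision training: the numeric range, precision, and overflow risk of FP32 / TF32 / FP16 / BF16 / FP8; how torch.amp.autocast picks a precision for each operator and how GradScaler performs dynamic loss scaling; and a runnable AMP benchmark for side-by-side comparison.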
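
As a quick orientation on the range and precision trade-off between these formats, here is a minimal sketch in plain Python (no PyTorch required). It assumes each format follows the IEEE-754 style layout, with the top exponent code reserved for inf/NaN, and derives the largest finite value, the smallest normal value, and the machine epsilon from the exponent and mantissa bit counts. The helper name `format_stats` and the `FORMATS` table are illustrative, not part of any library. FP8 E4M3 is left out on purpose: it reclaims most of the top exponent code for finite values (max 448), so this formula does not apply to it.

```python
def format_stats(exp_bits: int, man_bits: int) -> dict:
    """Range and precision of an IEEE-754 style float format.

    Assumes the all-ones exponent is reserved for inf/NaN, which holds for
    FP32, TF32, FP16, BF16 and FP8 E5M2, but not for FP8 E4M3.
    """
    bias = 2 ** (exp_bits - 1) - 1
    return {
        # Largest finite value: all mantissa bits set, largest usable exponent.
        "max": (2 - 2.0 ** -man_bits) * 2.0 ** bias,
        # Smallest positive normal value (subnormals go lower, with less precision).
        "min_normal": 2.0 ** (1 - bias),
        # Gap between 1.0 and the next representable value.
        "epsilon": 2.0 ** -man_bits,
    }


# (exponent bits, explicit mantissa bits) per format
FORMATS = {
    "FP32": (8, 23),
    "TF32": (8, 10),
    "FP16": (5, 10),
    "BF16": (8, 7),
    "FP8 E5M2": (5, 2),
}

for name, (e, m) in FORMATS.items():
    s = format_stats(e, m)
    print(
        f"{name:9s} max={s['max']:.4e} "
        f"min_normal={s['min_normal']:.4e} eps={s['epsilon']:.4e}"
    )

# FP16 tops out at 65504, which is why large activations or gradients overflow.
assert format_stats(5, 10)["max"] == 65504.0
```

The table makes the usual rule of thumb visible: BF16 and TF32 keep FP32's 8-bit exponent, so they cover roughly the same range at lower precision, while FP16 trades range for precision. That narrow FP16 range is what loss scaling compensates for.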