Unsloth and TorchAO bring FP8 reinforcement learning to consumer GPUs, cutting VRAM needs by 60% while delivering 1.4x speedups. Can your local hardware really train competitive reasoning models now?
Cerebras releases REAP-pruned GLM-4.6 variants at 25%, 30%, and 40% sparsity with FP8 quantization, but do they actually work?