Unsloth and TorchAO bring FP8 reinforcement learning to consumer GPUs, cutting VRAM needs by 60% while delivering 1.4x speedups. Can your local hardware really train competitive reasoning models now?
Cerebras releases REAP-pruned GLM-4.6 variants at 25%, 30%, and 40% sparsity with FP8 quantization, but do they actually work?