Tagged with

5 articles found

MiniMax M2.5: The $1/Hour Model That Makes Claude Opus Look Overpriced

MiniMax’s 230B MoE model hits 80.2% on SWE-Bench at 1/20th the cost of competitors. Here’s why the AI pricing model just collapsed.

#ai-models #coding-agents #m2.5...

ai-research

DeepSeek’s 86-Page Flex: Technical Transparency or Academic Overkill?

DeepSeek-R1’s paper ballooned from 22 to 86 pages, revealing Manifold-Constrained Hyper-Connections and a radically transparent training pipeline. Is this the blueprint for cost-efficient AI or a masterclass in engineering theater?

#ai-research #deepseek #mhc...

generative-ai

GLM-TTS: The 3-Second Voice Cloner That Outperforms Commercial Systems (But Has a Contraction Problem)

Z.ai’s GLM-TTS combines zero-shot voice cloning, multi-reward RL emotion control, and bilingual streaming synthesis. The open-source model beats closed alternatives on accuracy, but real-world testing reveals sharp edges.

#generative-ai #reinforcement-learning #text-to-speech...

large-language-models

The Open-Source Tipping Point: INTELLECT-3 Proves 100B+ MoE Models Can Outperform Corporate Giants

Prime Intellect releases a 100B+ parameter Mixture-of-Experts model that beats larger frontier models in reasoning, math, and coding, and is giving away the entire training recipe.

#large-language-models #mixture-of-experts #open-source...

fp8

The FP8 Revolution: How Unsloth Just Democratized Reinforcement Learning

Unsloth and TorchAO bring FP8 reinforcement learning to consumer GPUs, cutting VRAM needs by 60% while delivering 1.4x speedups. Can your local hardware really train competitive reasoning models now?

#fp8 #gpu-optimization #local-training...