The Economic Collapse of Cloud LLM APIs: Is Local Inference Still Viable?
Cloud LLM prices have cratered: Kimi K2.5 costs roughly a tenth of Opus, Gemini’s free tier is massive, and DeepSeek is nearly free. Meanwhile, running a 70B model locally still demands $1,000+ in hardware and delivers maybe 15 tok/s. The math has flipped, but the devil lives in the fine print.
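To see why the math has flipped, it helps to run the break-even arithmetic explicitly. The sketch below takes the $1,000 hardware figure and 15 tok/s throughput from above; the cloud price per million tokens is a placeholder, not a quote from any provider, so swap in real numbers before drawing conclusions.

```python
# Back-of-envelope break-even: local hardware amortization vs. cloud API pricing.
# The hardware cost and throughput come from the article; the API price is an
# illustrative placeholder, not any provider's actual rate.

HARDWARE_COST_USD = 1_000.0     # one-time local rig cost (article's figure)
LOCAL_TOKENS_PER_SEC = 15.0     # local generation throughput (article's figure)
API_PRICE_PER_MTOK_USD = 0.50   # hypothetical cloud price per million tokens

def breakeven_tokens() -> float:
    """Tokens you must generate before the rig pays for itself,
    ignoring electricity, depreciation, and your own time."""
    return HARDWARE_COST_USD / API_PRICE_PER_MTOK_USD * 1_000_000

def breakeven_days(utilization: float = 1.0) -> float:
    """Days of generation at local throughput to hit break-even.
    `utilization` is the fraction of each day the rig is actually busy."""
    tokens_per_day = LOCAL_TOKENS_PER_SEC * 86_400 * utilization
    return breakeven_tokens() / tokens_per_day

if __name__ == "__main__":
    print(f"Break-even: {breakeven_tokens():,.0f} tokens")
    print(f"At {LOCAL_TOKENS_PER_SEC:.0f} tok/s, 24/7: {breakeven_days():,.0f} days")
    print(f"At 10% utilization: {breakeven_days(0.1):,.0f} days")
```

With these placeholder numbers the rig needs about two billion tokens to pay for itself, which works out to over four years of generating around the clock, and far longer at realistic utilization. That is the flipped math in a nutshell; the fine print is everything the sketch ignores.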