The new Medusa-style MTP support in the llama.cpp beta isn’t just catching up; it threatens to rewrite the economics of local model serving.
Qwen 3.6 27B on consumer hardware is disrupting the SaaS subscription model. Here’s how, and why it’s a warning sign for cloud AI.