1 article found
The new Medusa-style MTP support in llama.cpp beta isn’t just catching up, it threatens to rewrite the economics of local model serving.