Tagged with

1 article found

The Fork That Finally Forked Back: llama.cpp Adopts ik_llama’s Secret Quantization Sauce

A controversial PR ports advanced IQ*_K quantization methods from the ik_llama.cpp fork into mainline llama.cpp, promising smaller models and better edge performance, but not without drama over code ownership and MIT license politics.

#ik_llama.cpp#llama.cpp#model-compression...