3 Billion Active Parameters Just Challenged 30 Billion: Inside Qwen3.6’s Sparse MoE
Alibaba’s Qwen3.6-35B-A3B activates only 3B parameters per token yet claims agentic coding parity with models 10x its size. We dissect the architecture and benchmarks, and ask whether this Apache 2.0 release actually changes the local AI equation.