BANANDRE
NO ONE CARES ABOUT CODE

Tagged with

#sparse-activation

1 article found

Step-3.5-Flash: The 196B Parameter Model That Makes Giants Look Wasteful

StepFun’s sparse MoE model activates only 11B of its 196B parameters per token, yet outperforms models 3-5x larger on coding and agentic tasks. It delivers 100-300 tok/s on consumer hardware and forces a reckoning with the parameter-count arms race.

#efficiency #moe #sparse-activation ...

2026 BANANDRE