A 9B-parameter model achieving six times the throughput of a 70B-parameter competitor raises questions about architectural innovation versus hardware dependency.
Meta's new 1B-parameter foundation model outperforms Gemma and Llama on benchmarks while fitting in your pocket. But is distilled intelligence the future?