2 articles found
Google’s T5Gemma 2 merges tied embeddings and unified attention to deliver efficient multimodal AI that outperforms decoder-only models at a fraction of the parameter cost.
Moonshot AI’s hybrid architecture delivers 6x decoding speed with 75% less memory, making 1M-token contexts actually practical.