Google’s Sequential Attention: A New Paradigm for Efficient Transformers
Google Research’s Sequential Attention challenges the parallel-processing orthodoxy of transformers, promising leaner models with no loss of accuracy. But is sequential processing a breakthrough or a bottleneck?