Hardware-aligned and natively trainable sparse attention mechanism.

Paper

arXiv: 2502.11089

attention, architecture, efficiency

More Links