Landmark release using Sparse and Linear Attention for 1M-token context windows on a single consumer-grade smartphone.
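
For a rough sense of why sparse and linear attention matter at this context length, the sketch below counts per-head entries for three attention styles at 1,000,000 tokens. It is back-of-the-envelope arithmetic, not a description of this model: the head dimension (64) and the sparse window (4,096) are hypothetical values chosen for illustration, and the function names are made up for this example.

```python
def full_attention_scores(n_tokens: int) -> int:
    """Entries in the n x n score matrix of standard softmax attention (one head)."""
    return n_tokens * n_tokens


def sparse_attention_scores(n_tokens: int, window: int) -> int:
    """Upper bound on score entries when each token attends to at most `window` tokens."""
    return n_tokens * min(window, n_tokens)


def linear_attention_state(head_dim: int) -> int:
    """Entries in the running state of kernelized linear attention (one head):
    a d x d key-value summary plus a d-vector normalizer, independent of n."""
    return head_dim * head_dim + head_dim


N_TOKENS = 1_000_000  # context window stated in the listing
HEAD_DIM = 64         # hypothetical head dimension (assumption)
WINDOW = 4_096        # hypothetical sparse-attention window (assumption)

if __name__ == "__main__":
    print("full attention scores:  ", full_attention_scores(N_TOKENS))
    print("sparse attention scores:", sparse_attention_scores(N_TOKENS, WINDOW))
    print("linear attention state: ", linear_attention_state(HEAD_DIM))
```

Under these assumptions the full score matrix has 10^12 entries per head, while the sparse variant grows only linearly with the token count and the linear-attention state does not grow with it at all.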

Model Details

Context window: 1,000,000 tokens
on-device, scaling, efficiency