Experimental diffusion-based language model capable of inference speeds exceeding 2,100 tokens/s.
generation · efficiency · research