Fully asynchronous reinforcement learning system for training large reasoning and agentic models. Achieves up to 2.77x training speedup compared to synchronous systems. Used to train the Ring thinking model series.

Paper

arXiv: 2505.24298

Library

GitHub Repository

training, framework, efficiency

Related