"From SWE-ZERO to SWE-HERO: Execution-free to Execution-based Fine-tuning for Software Engineering Agents." A two-stage supervised fine-tuning (SFT) approach: (1) training on large-scale execution-free trajectories, then (2) targeted refinement with execution feedback. SWE-HERO-32B achieves 62.2% on SWE-bench Verified.

The authors released 300K first-stage and 13K second-stage trajectories distilled from a larger model. Despite training exclusively on Python, the model reaches 44.1% on multilingual benchmarks, demonstrating cross-language generalization. By Ludwig, Ahmad, Majumdar, and Ginsburg.
