Autonomous Agentic Data Engineering

"Exploring Autonomous Agentic Data Engineering for Model Specialization." Formalizes a new task: an LLM agent autonomously executes the entire end-to-end data-curation loop for fine-tuning a student model on a target domain. The agent plans the curriculum, generates training data, runs the supervised fine-tuning (SFT) step, observes the student's downstream performance, and iteratively refines the data mixture, all without a human-designed workflow.

Evaluation uses Llama-3.1-8B-Instruct as the student and Qwen3-30B-A3B as the teacher across the Science, Code, and Finance domains, with seven frontier LLMs (including GPT-5.2, Qwen3-Max, DeepSeek-R1, DeepSeek-V3.1, Gemini-2.5 Pro, and Claude-4 Sonnet) compared as the data-engineering agent. In the multi-iteration closed-loop setting, GPT-5.2 yields a 57.29% average relative gain over the base student, the strongest of the agents tested.
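
The plan, generate, fine-tune, observe, refine cycle and the notion of relative gain over the base student can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names (`closed_loop`, `relative_gain`, the `toy_*` stand-ins), the mixture representation, and every number are hypothetical, and the real steps (teacher generation, SFT, benchmark evaluation) are replaced by toy callables. The relative-gain formula assumes the usual percent improvement over a baseline; the exact aggregation behind the reported average is not given in this summary.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: topic name -> number of examples to generate.
Mixture = Dict[str, int]


def relative_gain(base: float, tuned: float) -> float:
    """Percent improvement of the tuned score over the base score (assumed definition)."""
    return (tuned - base) / base * 100.0


def closed_loop(
    plan: Callable[[], Mixture],
    generate: Callable[[Mixture], List[str]],
    train_and_eval: Callable[[List[str]], float],
    refine: Callable[[Mixture, List[float]], Mixture],
    iterations: int,
) -> Tuple[Mixture, float, List[float]]:
    """Run plan -> generate -> train/eval -> refine and keep the best mixture seen."""
    mixture = plan()
    best_mixture, best_score = dict(mixture), float("-inf")
    history: List[float] = []
    for _ in range(iterations):
        data = generate(mixture)          # stand-in for teacher data generation
        score = train_and_eval(data)      # stand-in for SFT plus downstream evaluation
        history.append(score)
        if score > best_score:
            best_mixture, best_score = dict(mixture), score
        mixture = refine(mixture, history)  # agent adjusts the mixture from feedback
    return best_mixture, best_score, history


# Toy stand-ins so the sketch runs end to end; none of these reflect real behavior.
def toy_plan() -> Mixture:
    return {"easy": 5, "hard": 5}


def toy_generate(mixture: Mixture) -> List[str]:
    return [topic for topic, n in mixture.items() for _ in range(n)]


def toy_train_and_eval(data: List[str]) -> float:
    # Pretend the student's score equals the share of "hard" examples.
    return data.count("hard") / len(data)


def toy_refine(mixture: Mixture, history: List[float]) -> Mixture:
    # Shift one example from "easy" to "hard" each round.
    new = dict(mixture)
    if new["easy"] > 0:
        new["easy"] -= 1
        new["hard"] += 1
    return new


best_mixture, best_score, history = closed_loop(
    toy_plan, toy_generate, toy_train_and_eval, toy_refine, iterations=3
)
BASE_SCORE = 0.4  # made-up base-student score, for illustration only
gain = relative_gain(BASE_SCORE, best_score)
print(best_mixture, history, round(gain, 2))
```

Passing each step in as a callable keeps the loop itself independent of how data is generated or how the student is trained, which is the part an autonomous agent would decide for itself.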

Joint work by Zhejiang University and Tencent Platform & Content Group (PCG). Code release at github.com/zjunlp. CC BY-NC-SA 4.0.

Paper (arXiv)

arXiv HTML

Authors: Yujie Luo · Xiangyuan Ru · Jingsheng Zheng · Jingjing Wang · Yuqi Zhu · Jintian Zhang · Runnan Fang · Kewei Xu · Ye Liu · Zheng Wei · Jiang Bian · Zang Li · Shumin Deng

agentic · training · research
