RLAIF-V

"RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness." Aligns multimodal LLMs through fully open-source AI feedback, combining high-quality preference data generation with self-feedback inference-time scaling. RLAIF-V-7B reduces object hallucination by 80.7% and overall hallucination by 33.7%; RLAIF-V-12B achieves trustworthiness above GPT-4V via self-alignment.

Cited as the alignment recipe behind MiniCPM-o's safety profile. CVPR 2025 highlight. Joint work with Tsinghua (NLP Lab), Shanghai Qi Zhi Institute, HIT, Alibaba Taobao & Tmall, PCL, and NUS.

Paper (arXiv) | GitHub | HuggingFace (RLAIF-V-12B) | HuggingFace dataset

Paper

Venue: CVPR 2025 (highlight)

Citations: 3

arXiv | HTML

alignment, multimodal, foundational
