Step-3-VL-10B
model, paper
"Compact giant" outperforming models 20x its size (including GPT-4o on certain benchmarks) via fully unfrozen perception-decoder training.
Outputs 2
Step-3-VL-10B
model
Architecture DENSE
Parameters 10B
Released Jan 20, 2026 on Hugging Face.
Step3-VL-10B Technical Report
paper
Focused on "Intrinsic Vision-Language Synergy."
arXiv: 2601.09668