Commercial-grade, lightweight (1B-parameter), end-to-end OCR expert vision-language model (VLM). Connects a 0.4B native-resolution ViT to a 0.5B Hunyuan LLM. Won first place in the ICDAR 2025 DIMT Challenge (Small Model Track).

Model Details

Architecture: Dense
Parameters: 1B
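
The headline figure can be cross-checked against the component sizes quoted in the description. The following is a minimal sketch, not part of the model's tooling: the dictionary keys and helper names are hypothetical, and only the two sizes stated in the card (0.4B ViT, 0.5B LLM) are used. The card does not itemize any other components, so the gap between the sum and the rounded 1B figure is left unexplained here.

```python
# Component sizes in millions of parameters, taken from the card's description.
# The key names are illustrative labels, not identifiers from the model.
COMPONENTS_M = {
    "vision_encoder_vit": 400,  # 0.4B native-resolution ViT
    "language_model": 500,      # 0.5B Hunyuan LLM
}


def total_params_millions(components):
    """Sum the listed component sizes (in millions of parameters)."""
    return sum(components.values())


def to_billions_label(millions):
    """Round to the nearest whole billion and format as a size label."""
    return f"{round(millions / 1000)}B"


total = total_params_millions(COMPONENTS_M)
print(f"listed components: {total}M, card label: {to_billions_label(total)}")
```

The two listed components sum to 900M parameters, which rounds to the 1B shown under Parameters.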

Paper

arXiv: 2511.19575

multimodal, vision, open-weight