MiniCPM-V
model paper
Vision-language model series achieving GPT-4V-level performance on mobile devices. The first multimodal model deployed natively on a smartphone. Progressed from V 2.0 through V 2.6 with world-class OCR and video understanding.
Outputs 4
MiniCPM-Llama3-V 2.5
model
GPT-4V-level performance in a 9B parameter package with world-class OCR capabilities.
Parameters 9B
MiniCPM-V 2.6
model
Introduced multi-image and video understanding, outperforming GPT-4V on major benchmarks.