GLM-4.5V
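Open-source vision-language model (106B total parameters, 12B active via mixture-of-experts, MoE). Supports a 64K-token context window for multi-image and video inputs. Reports state-of-the-art results on 42 public vision-language benchmarks among models of similar scale. MIT-licensed.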
Model Details
Architecture MoE
Parameters 106B
Active params 12B
Context window 64,000 tokens
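As a rough illustration of what these figures imply, the sketch below computes the share of parameters that are active and checks whether a request fits in the context window. The helper names are hypothetical, and the assumption that prompt and generated tokens share the same 64,000-token window is ours, not something the listing states. How many tokens an image or video frame consumes depends on the model's processor and is not modeled here.

```python
# Figures taken from the Model Details list above.
TOTAL_PARAMS_B = 106      # total parameters, in billions
ACTIVE_PARAMS_B = 12      # active parameters, in billions
CONTEXT_WINDOW = 64_000   # context window, in tokens


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters that are active (hypothetical helper)."""
    return active_b / total_b


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    window: int = CONTEXT_WINDOW) -> bool:
    """Check a token budget against the window.

    Assumption: prompt and generated tokens count against the same window.
    """
    return prompt_tokens + max_new_tokens <= window


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
    print(f"Active share of parameters: {share:.1%}")
    print("60,000-token prompt + 4,000 new tokens fits:",
          fits_in_context(60_000, 4_000))
```

Roughly one ninth of the parameters are active, which is why an MoE model of this size can have a compute cost closer to that of a much smaller dense model, while still needing memory for all 106B parameters.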