An open-source family of spatial intelligence models built on multimodal foundation models (Qwen3-VL, InternVL3). Includes the curated SenseNova-SI-8M dataset of 8 million spatial samples. SenseNova-SI-8B outperforms GPT-5 and Gemini-3-Pro on spatial benchmarks, including VSI-Bench (68.7%), MMSI (43.3%), and MindCube (85.6%). Accepted at CVPR 2026.

Outputs 3

SenseNova-SI Models

model

Variants

Name                            Parameters  Notes
SenseNova-SI-1.1-Qwen2.5-VL-3B  3B
SenseNova-SI-1.1-Qwen2.5-VL-7B  7B
SenseNova-SI-1.1-InternVL3-2B   2B
SenseNova-SI-1.1-InternVL3-8B   8B
SenseNova-SI-1.2-InternVL3-8B   8B          State-of-the-art among open-source spatial models
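
The variant names listed here appear to follow the pattern SenseNova-SI-<version>-<base model>-<size>. As a minimal sketch, assuming that pattern holds for every variant (it is inferred from the names, not documented by the authors), the names can be split into their parts in Python. The `Variant` type and `parse_variant` helper are hypothetical and exist only for illustration.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical container for the parts of a variant name."""
    version: str  # release version, e.g. "1.1"
    base: str     # base multimodal model, e.g. "InternVL3"
    size: str     # parameter count label, e.g. "8B"


# Assumed naming pattern: SenseNova-SI-<version>-<base model>-<size>
_NAME = re.compile(
    r"^SenseNova-SI-(?P<version>\d+\.\d+)-(?P<base>.+)-(?P<size>\d+(?:\.\d+)?B)$"
)


def parse_variant(name: str) -> Variant:
    """Split a variant name into version, base model, and size."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"name does not follow the assumed pattern: {name!r}")
    return Variant(match["version"], match["base"], match["size"])


if __name__ == "__main__":
    for name in (
        "SenseNova-SI-1.1-Qwen2.5-VL-3B",
        "SenseNova-SI-1.2-InternVL3-8B",
    ):
        print(parse_variant(name))
```

Names that do not fit the assumed pattern raise a ValueError rather than being guessed at, so a future variant with a different naming scheme is reported instead of silently misparsed.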

Scaling Spatial Intelligence with Multimodal Foundation Models

paper

arXiv: 2511.13719

Venue: CVPR 2026

SenseNova-SI-8M Dataset

dataset

8 million diverse spatial data samples, organized under a rigorous taxonomy of spatial capabilities, for training spatial intelligence models.

spatial-intelligence, open-weight, vision, multimodal