StepFun's new flagship open-weight reasoning VLM: a sparse MoE with 198B total and ~11B active parameters, a 256K context window, tunable reasoning tiers, and ~400 tok/s output. Released under Apache 2.0 on Hugging Face in BF16, FP8, NVFP4, and GGUF variants.

Self-reported benchmarks: SimpleVQA 79.2, ClawEval-1.1 67.1, SWE-Bench Pro 56.3 (#2). Also described as competitive with Gemini 3 Flash on multiple temporal tracks. Successor in the closed-flagship slot to Step-3.5-Flash and the open Step-3 line.

Model Details

Architecture: MoE
Parameters: 198B (total)
Active params: ~11B
Context window: 262,144 tokens
License: Apache 2.0
Tags: frontier, reasoning, multimodal, moe, open-weight
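
As a rough back-of-the-envelope check on the figures above, the sketch below estimates the weight-only memory footprint per precision variant and the share of parameters active per token. The bytes-per-parameter values are nominal assumptions (2 for BF16, 1 for FP8, 0.5 for NVFP4) and ignore quantization scale overhead, KV cache, and activations. GGUF is left out because the listing does not say which quantization levels are published. The helper names are illustrative, not part of any StepFun tooling.

```python
# Figures taken from the listing above.
TOTAL_PARAMS = 198e9      # total parameters
ACTIVE_PARAMS = 11e9      # approximate active parameters per token
CONTEXT_TOKENS = 262_144  # listed context window

# Assumption: nominal storage cost per parameter, ignoring scale factors
# and any layers kept at higher precision.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "NVFP4": 0.5}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Weight-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def active_fraction(total: float, active: float) -> float:
    """Share of parameters used per token in a sparse MoE."""
    return active / total


for name, nbytes in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_gb(TOTAL_PARAMS, nbytes):.0f} GB of weights")

print(f"Active per token: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.1%}")
print(f"256K context is 256 * 1024 tokens: {CONTEXT_TOKENS == 256 * 1024}")
```

Under these assumptions the BF16 weights alone come to roughly 396 GB, so actual serving requirements will be higher once cache and runtime overhead are included.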

Related