GPT-4o

GPT-4o ("omni") is OpenAI's first natively multimodal model: it processes text, audio, image, and video inputs and generates text and audio outputs within a single end-to-end architecture. It has a 128K-token context window. OpenAI has not disclosed its parameter count.

GPT-4o matched GPT-4 Turbo on text intelligence while being 2x faster and 50% cheaper. Its native audio capabilities enabled real-time voice conversation with emotional expression and multilingual support. A cost-optimized variant, GPT-4o mini, was released in July 2024. AA Intelligence Index: 11. License: proprietary.

Model Details

Parameters (est.): ~720B
Context window: 128,000 tokens
AA Intelligence Index: 11

Tags: frontier, multimodal, speech
