Native bilingual Chinese-English image generation model with integrated LLM text encoder and character-level text rendering.

Paper

arXiv: 2503.07703

generation, vision

Related