CogView2
Model
A hierarchical transformer-based text-to-image model (6B parameters) that introduced the Cross-modal General Language Model (CogLM) pretraining task and Local Parallel Autoregressive (LoPAR) generation. CogView2 generates images up to 10x faster than the original CogView and supports high-resolution synthesis and interactive text-guided image editing.
Paper
arXiv: 2204.14217