Massive, high-quality multimodal pre-training corpus containing over 2 TB of English and Chinese text, image-text pairs, and video.
training-data, training, multimodal

Notes

arXiv submission Aug 21, 2023.