AI Watchdog: InternVid
Explore original journalism about this data set through AI Watchdog, The Atlantic’s ongoing investigation into the generative-AI industry.
InternVid is a collection of 5,428,743 videos split into 234 million clips, which are paired with automatically generated text captions. It was compiled by OpenGVLab, a research group at Shanghai AI Laboratory, and released in 2023. The videos’ content spans daily life, sports, entertainment, education, and more. The data set is hosted on Hugging Face, an AI-development hub, where it has been downloaded more than 3,200 times. It has been used by Stability AI and ByteDance for experimentation.