AI Watchdog: Panda-70M

Explore original journalism about this data set through AI Watchdog, The Atlantic’s ongoing investigation into the generative-AI industry.


Panda-70M is a collection of 3.8 million videos from YouTube split into approximately 70.7 million clips and paired with text captions. It was compiled by Snap and released in 2024. Developers used AI to create a new set of captions describing what is pictured in each clip.