First Frame Is the Place to Go for Video Content Customization

Published in CVPR 2026

We reveal that video models implicitly treat the first frame as a conceptual memory buffer for visual entities, enabling robust video content customization with only 20-50 training examples.

arXiv

Project Page

Recommended citation: Jingxi Chen*, Zongxia Li*, Zhichao Liu, Guangyao Shi, Xiyang Wu, Fuxiao Liu, Cornelia Fermüller, Brandon Y. Feng, Yiannis Aloimonos. (2026). "First Frame Is the Place to Go for Video Content Customization." CVPR.
