First Frame Is the Place to Go for Video Content Customization

Published in CVPR 2026

We reveal that video models implicitly treat the first frame as a conceptual memory buffer for visual entities, enabling robust video content customization with only 20-50 training examples.

arXiv | Project Page

Recommended citation: Jingxi Chen*, Zongxia Li*, Zhichao Liu, Guangyao Shi, Xiyang Wu, Fuxiao Liu, Cornelia Fermüller, Brandon Y. Feng, Yiannis Aloimonos. (2026). "First Frame Is the Place to Go for Video Content Customization." CVPR.