A Survey of State of the Art Large Vision Language Models: Benchmark Evaluations and Challenges

Published in CVPR Workshop 2025 (Oral)

A comprehensive survey of vision-language models (VLMs) covering model architectures, alignment methods, benchmarks, and open challenges such as hallucination, fairness, and safety.

Paper | Code

Recommended citation: Zongxia Li*, Xiyang Wu*, Hongyang Du, Fuxiao Liu, Huy Nghiem, Guangyao Shi. (2025). "A Survey of State of the Art Large Vision Language Models: Benchmark Evaluations and Challenges." CVPR Workshop on Multimodal Learning (Oral).
Download Paper