General-purpose Visual Understanding Evaluation (G-VUE) is a comprehensive benchmark covering the full spectrum of visual cognitive abilities with four functional domains -- Perceive, Ground, Reason, and Act. The four domains are embodied in 11 carefully curated tasks, from 3D reconstruction to visual reasoning and manipulation.
2 PAPERS • NO BENCHMARKS YET