MMStar is an elite vision-indispensable multi-modal benchmark comprising 1,500 meticulously selected samples. The samples are balanced and purified so that each one depends on visual content, shows minimal data leakage, and requires advanced multi-modal capabilities to solve. MMStar evaluates large vision-language models (LVLMs) across 6 core capabilities and 18 detailed axes.