Video-based Generative Performance Benchmarking