Active Learning for Video Description With Cluster-Regularized Ensemble Ranking

Automatic video captioning aims to train models to generate text descriptions for all segments in a video, however, the most effective approaches require large amounts of manual annotation which is slow and expensive. Active learning is a promising way to efficiently build a training set for video captioning tasks while reducing the need to manually label uninformative examples... (read more)

Results in Papers With Code
(↓ scroll down to see all results)