Fine-tuning Siamese Networks to Assess Sport Gestures Quality

This paper presents an Action Quality Assessment (AQA) approach that learns to automatically score action realization from temporal sequences like videos. To manage the small size of most of databases capturing actions or gestures, we propose to use Siamese Networks. In the literature, Siamese Networks are widely used to rank action scores. Indeed, their purpose is not to regress scores but to predict a value that respects true scores order so that it can be used to rank actions according to their quality. For AQA, we need to predict real scores, as well as the difference between these scores and their range. Thus, we first introduce a new loss function to train Siamese Networks in order to regress score gaps. Once the Siamese network is trained, a branch of this network is extracted and fine-tuned for score prediction. We tested our approach on a public database, the AQA-7 dataset, composed of videos from 7 sports. Our results outperform state of the art on AQA task. Moreover, we show that the proposed method is also more efficient for action ranking.

PDF

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods