Paper tables with annotated results for Signs in time: Encoding human motion as a temporal image

Paper

Signs in time: Encoding human motion as a temporal image

The goal of this work is to recognise and localise short temporal signals in image time series, where strong supervision is not available for training. To this end, we propose an image encoding that concisely represents human motion in a video sequence in a form that is suitable for learning with a ConvNet. The encoding reduces the pose information from each image to a single column, dramatically reducing the input requirements of the network while retaining the information essential for recognition. The encoding is applied to the task of recognising and localising signed gestures in British Sign Language (BSL) videos. We demonstrate that, using the proposed encoding, signs as short as 10 frames in duration can be learnt from clips lasting hundreds of frames, using only weak (clip-level) supervision and with considerable label noise.
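To make the idea of a "temporal image" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that the pose in each frame is already available as a list of (x, y) joint coordinates, which the abstract does not specify; the names `frame_to_column` and `encode_clip` and the normalisation by frame size are hypothetical choices made for illustration. Each frame becomes one column, and the columns are laid side by side so that the horizontal axis of the result is time.

```python
from typing import List, Sequence, Tuple

# Hypothetical input format: one (x, y) pair per joint for each frame.
Pose = Sequence[Tuple[float, float]]


def frame_to_column(pose: Pose, width: float, height: float) -> List[float]:
    """Flatten one frame's joints into a single column, scaled by frame size."""
    column = []
    for x, y in pose:
        column.append(x / width)
        column.append(y / height)
    return column


def encode_clip(poses: Sequence[Pose], width: float, height: float) -> List[List[float]]:
    """Stack per-frame columns side by side: rows are pose values, columns are time."""
    columns = [frame_to_column(p, width, height) for p in poses]
    if not columns:
        return []
    n_rows = len(columns[0])
    if any(len(c) != n_rows for c in columns):
        raise ValueError("every frame must have the same number of joints")
    # Transpose so that image[r][t] is pose value r at frame t.
    return [[col[r] for col in columns] for r in range(n_rows)]


# Toy clip: 3 frames, 2 joints each, in a 200 x 100 pixel frame.
clip = [
    [(100.0, 50.0), (120.0, 80.0)],
    [(110.0, 50.0), (120.0, 90.0)],
    [(120.0, 50.0), (120.0, 100.0)],
]
image = encode_clip(clip, width=200.0, height=100.0)
print(len(image), "rows x", len(image[0]), "columns")
```

In this sketch a clip of T frames with J joints yields a 2J x T array, far smaller than the T full video frames it summarises, which is the kind of reduction in network input that the abstract describes.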


