AV Digits Database

Introduced by Petridis et al. in Visual-Only Recognition of Normal, Whispered and Silent Speech

The AV Digits Database is an audiovisual database of normal, whispered and silent speech. In total, 53 participants were recorded from three views (frontal, 45° and profile) pronouncing digits and phrases in the three speech modes.

The database consists of two parts: digits and short phrases. In the first part, participants were asked to read the 10 digits, from 0 to 9, in English in random order five times. Non-native English speakers also repeated this part in their native language. In total, 53 participants (41 males and 12 females) from 16 nationalities were recorded, with a mean age of 26.7 years and a standard deviation of 4.3 years.

In the second part, participants were asked to read 10 short phrases. The phrases are the same as the ones used in the OuluVS2 database: “Excuse me”, “Goodbye”, “Hello”, “How are you”, “Nice to meet you”, “See you”, “I am sorry”, “Thank you”, “Have a good time”, “You are welcome”. Again, each phrase was repeated five times in each of the three speech modes: normal, whispered and silent. Thirty-nine participants (32 males and 7 females) were recorded for this part, with a mean age of 26.3 years and a standard deviation of 3.8 years.
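
The recording protocol described above implies a nominal size for each part, which can be worked out with a short sketch. The numbers below are an estimate, not an official figure: the sketch assumes that the digits were also recorded in all three speech modes (suggested by "Again" in the phrase description), that only the English pass of the digits is counted, that no recordings are missing, and that every utterance was captured from all three views. All names in the code (`MODES`, `VIEWS`, `nominal_counts`, and so on) are hypothetical and are not part of any released tooling for the database.

```python
from itertools import product

# Labels taken from the description; the exact naming is an assumption.
MODES = ("normal", "whispered", "silent")
VIEWS = ("frontal", "45", "profile")
REPETITIONS = 5

DIGITS = tuple(str(d) for d in range(10))
PHRASES = (
    "Excuse me", "Goodbye", "Hello", "How are you", "Nice to meet you",
    "See you", "I am sorry", "Thank you", "Have a good time", "You are welcome",
)


def utterances_per_participant(items):
    """List every (item, mode, repetition) combination for one participant."""
    return list(product(items, MODES, range(1, REPETITIONS + 1)))


def nominal_counts(items, n_participants):
    """Nominal totals, assuming a complete recording set with no gaps."""
    per_participant = len(utterances_per_participant(items))
    utterances = per_participant * n_participants
    return {
        "per_participant": per_participant,
        "utterances": utterances,
        # One clip per view, assuming all views captured each utterance.
        "clips_all_views": utterances * len(VIEWS),
    }


# Participant counts (53 and 39) come from the description above.
digit_counts = nominal_counts(DIGITS, 53)    # English digits only
phrase_counts = nominal_counts(PHRASES, 39)

print("digits:", digit_counts)
print("phrases:", phrase_counts)
```

Under these assumptions each participant contributes 10 × 3 × 5 = 150 utterances per part. The native-language digit recordings are left out because the description does not say how many participants were non-native speakers.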

Source: AV Digits Database

License


  • Unknown
