Search Results for author: Brendan Shillingford

Found 14 papers, 5 papers with code

Composing RNNs and FSTs for Small Data: Recovering Missing Characters in Old Hawaiian Text

no code implementations • 24 Jul 2022 • Oiwi Parker Jones, Brendan Shillingford

In contrast to the older writing system of the 19th century, modern Hawaiian orthography employs characters for long vowels and glottal stops. (A rough sketch of the RNN-plus-FST restoration idea appears below.)

Reading Comprehension · Transliteration
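A minimal sketch of the RNN-plus-FST idea from this paper, under assumptions not taken from the source: the character-level language model is reduced to a toy bigram table, the FST to a hand-written expansion map from old-orthography characters to modern candidates (kahakō-marked long vowels and the ʻokina), and exact composition is replaced by a simple beam search. This illustrates the technique, not the authors' implementation.

```python
# Sketch only: restore missing 'okina / kahako by composing a character LM
# with a transducer-like expansion map. All tables below are hypothetical.
import math

# Hypothetical "FST": each old-orthography character may expand to several
# modern candidates (plain vowel, long vowel, or glottal stop + vowel).
EXPANSIONS = {
    "a": ["a", "ā", "ʻa"],
    "o": ["o", "ō", "ʻo"],
    "u": ["u", "ū", "ʻu"],
}

def lm_logprob(prev, ch, bigram):
    """Toy character-bigram score; an RNN language model would replace this."""
    return math.log(bigram.get((prev, ch), 1e-6))

def restore(old_text, bigram, beam_size=4):
    """Beam search over all expansions allowed by the transducer, scored by the LM."""
    beams = [("", 0.0)]                      # (modern-text prefix, log-probability)
    for ch in old_text:
        candidates = []
        for prefix, score in beams:
            for expansion in EXPANSIONS.get(ch, [ch]):
                s, prev = score, (prefix[-1] if prefix else "<s>")
                for out_ch in expansion:
                    s += lm_logprob(prev, out_ch, bigram)
                    prev = out_ch
                candidates.append((prefix + expansion, s))
        beams = sorted(candidates, key=lambda c: -c[1])[:beam_size]
    return beams[0][0]
```

Given a bigram table, `restore("pau", bigram)` would return the highest-scoring modern spelling among the expansions the map allows (e.g. "pau", "paʻu", "pāʻū").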

Restoring and attributing ancient texts using deep neural networks

2 code implementations • Nature 2022 • Yannis Assael, Thea Sommerschield, Brendan Shillingford, Mahyar Bordbar, John Pavlopoulos, Marita Chatzipanagiotou, Ion Androutsopoulos, Jonathan Prag, Nando de Freitas

Ithaca can attribute inscriptions to their original location with an accuracy of 71% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history.

Ancient Text Restoration · Attribute

Interactive decoding of words from visual speech recognition models

no code implementations • 1 Jul 2021 • Brendan Shillingford, Yannis Assael, Misha Denil

This work describes an interactive decoding method to improve the performance of visual speech recognition systems using user input to compensate for the inherent ambiguity of the task. (A rough sketch of such an interactive decoding loop appears below.)

Position · speech-recognition · +1
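A minimal sketch of one way such interactive decoding could work, under assumptions not taken from the paper: the recognizer emits a ranked list of word candidates per position, and the user is queried only when the score margin between the top two candidates is small. The margin threshold and data layout are illustrative.

```python
# Sketch only: query the user at ambiguous positions of a word-level
# hypothesis list; otherwise accept the recognizer's top candidate.

def interactive_decode(candidates_per_position, ask_user, margin=1.0):
    """candidates_per_position: list of [(word, log_score), ...], best first.
    ask_user: callback taking a list of words and returning a chosen index."""
    words_out = []
    for candidates in candidates_per_position:
        words = [w for w, _ in candidates]
        ambiguous = (
            len(candidates) > 1
            and candidates[0][1] - candidates[1][1] < margin
        )
        index = ask_user(words) if ambiguous else 0
        words_out.append(words[index])
    return " ".join(words_out)

# Example: the "user" always keeps the first suggestion shown.
print(interactive_decode(
    [[("there", -0.1), ("their", -0.2)], [("cat", -0.1), ("cap", -3.0)]],
    ask_user=lambda words: 0,
))
```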

Large-scale multilingual audio visual dubbing

no code implementations • 6 Nov 2020 • Yi Yang, Brendan Shillingford, Yannis Assael, Miaosen Wang, Wendi Liu, Yutian Chen, Yu Zhang, Eren Sezener, Luis C. Cobo, Misha Denil, Yusuf Aytar, Nando de Freitas

The visual content is translated by synthesizing lip movements for the speaker to match the translated audio, creating a seamless audiovisual experience in the target language.

Translation

Recurrent Neural Network Transducer for Audio-Visual Speech Recognition

1 code implementation • 8 Nov 2019 • Takaki Makino, Hank Liao, Yannis Assael, Brendan Shillingford, Basilio Garcia, Otavio Braga, Olivier Siohan

This work presents a large-scale audio-visual speech recognition system based on a recurrent neural network transducer (RNN-T) architecture. (A rough sketch of an audio-visual RNN-T appears below.)

Ranked #5 on Audio-Visual Speech Recognition on LRS3-TED (using extra training data)

Audio-Visual Speech Recognition · Lipreading · +2
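A minimal PyTorch sketch of an RNN-T-style audio-visual model, under assumptions not taken from the paper: frame-aligned audio and visual features are concatenated before a single LSTM encoder, a second LSTM serves as the prediction network, and a small joint network produces per-(frame, label) logits for a transducer loss. Layer sizes and the fusion strategy are illustrative.

```python
# Sketch only: audio-visual RNN-T with early feature fusion.
import torch
import torch.nn as nn

class AudioVisualRNNT(nn.Module):
    def __init__(self, audio_dim=80, video_dim=512, hidden=256, vocab=64):
        super().__init__()
        self.encoder = nn.LSTM(audio_dim + video_dim, hidden, batch_first=True)
        self.predictor = nn.LSTM(vocab, hidden, batch_first=True)   # previous labels, one-hot
        self.joint = nn.Sequential(
            nn.Linear(2 * hidden, hidden),
            nn.Tanh(),
            nn.Linear(hidden, vocab + 1),                            # +1 for the blank symbol
        )

    def forward(self, audio, video, labels_onehot):
        # audio: (B, T, audio_dim); video: (B, T, video_dim), frame-aligned for simplicity
        enc, _ = self.encoder(torch.cat([audio, video], dim=-1))     # (B, T, H)
        pred, _ = self.predictor(labels_onehot)                      # (B, U, H)
        # Broadcast onto the (T, U) lattice consumed by a transducer loss.
        t = enc.unsqueeze(2).expand(-1, -1, pred.size(1), -1)
        u = pred.unsqueeze(1).expand(-1, enc.size(1), -1, -1)
        return self.joint(torch.cat([t, u], dim=-1))                 # (B, T, U, vocab + 1)
```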

Make Up Your Mind! Adversarial Generation of Inconsistent Natural Language Explanations

1 code implementation • ACL 2020 • Oana-Maria Camburu, Brendan Shillingford, Pasquale Minervini, Thomas Lukasiewicz, Phil Blunsom

To increase trust in artificial intelligence systems, a promising research direction consists of designing neural models capable of generating natural language explanations for their predictions.

Decision Making · Natural Language Inference

Speech bandwidth extension with WaveNet

no code implementations • 5 Jul 2019 • Archit Gupta, Brendan Shillingford, Yannis Assael, Thomas C. Walters

This paper proposes an approach where a communication node can instead extend the bandwidth of a band-limited incoming speech signal that may have been passed through a low-rate codec. (A rough sketch of a conditional bandwidth-extension model appears below.)

Bandwidth Extension
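A minimal PyTorch sketch of a WaveNet-inspired bandwidth extender, under assumptions not taken from the paper: a stack of dilated causal convolutions operates on the narrowband signal (already resampled to the target rate) and regresses the wideband waveform directly, rather than using the full gated, autoregressive WaveNet. Channel counts and the regression output are illustrative.

```python
# Sketch only: dilated causal convolution stack over the narrowband input.
import torch
import torch.nn as nn

class TinyBandwidthExtender(nn.Module):
    def __init__(self, channels=64, layers=6, kernel=3):
        super().__init__()
        self.pre = nn.Conv1d(1, channels, 1)
        self.dilated = nn.ModuleList([
            nn.Conv1d(channels, channels, kernel,
                      padding=(kernel - 1) * 2 ** i, dilation=2 ** i)
            for i in range(layers)
        ])
        self.post = nn.Conv1d(channels, 1, 1)

    def forward(self, narrowband):
        # narrowband: (batch, 1, time), already upsampled to the wideband rate
        h = self.pre(narrowband)
        for conv in self.dilated:
            z = conv(h)[..., : h.size(-1)]   # trim the right padding to stay causal
            h = h + torch.tanh(z)            # residual connection
        return self.post(h)                  # predicted wideband waveform

# One second of 16 kHz input in, one second of predicted wideband signal out.
wideband = TinyBandwidthExtender()(torch.randn(1, 1, 16000))
```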

Recovering Missing Characters in Old Hawaiian Writing

no code implementations • EMNLP 2018 • Brendan Shillingford, Oiwi Parker Jones

In contrast to the older writing system of the 19th century, modern Hawaiian orthography employs characters for long vowels and glottal stops.

Language Modelling · Reading Comprehension · +1

Large-Scale Visual Speech Recognition

no code implementations • ICLR 2019 • Brendan Shillingford, Yannis Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Ben Coppin, Ben Laurie, Andrew Senior, Nando de Freitas

To achieve this, we constructed the largest existing visual speech recognition dataset, consisting of pairs of text and video clips of faces speaking (3,886 hours of video).

Ranked #11 on Lipreading on LRS3-TED (using extra training data)

Lipreading · speech-recognition · +1
