Search Results for author: Shiva Sundaram

Found 8 papers, 0 papers with code

Audiovisual Highlight Detection in Videos

no code implementations • 11 Feb 2021 • Karel Mundnich, Alexandra Fenster, Aparna Khare, Shiva Sundaram

To better study the task of highlight detection, we run a pilot experiment with highlight annotations for a small subset of video clips and fine-tune our best model on it.

Highlight Detection, Object Recognition +2

Self-Supervised learning with cross-modal transformers for emotion recognition

no code implementations • 20 Nov 2020 • Aparna Khare, Srinivas Parthasarathy, Shiva Sundaram

Self-supervised learning has shown improvements on tasks with limited labeled datasets in domains like speech and natural language.

Emotion Recognition, Language Modelling +4

Multi-modal embeddings using multi-task learning for emotion recognition

no code implementations • 10 Sep 2020 • Aparna Khare, Srinivas Parthasarathy, Shiva Sundaram

General embeddings like word2vec, GloVe and ELMo have shown a lot of success in natural language tasks.

Automatic Speech Recognition (ASR) +6

Multimodal and Multiresolution Speech Recognition with Transformers

no code implementations • ACL 2020 • Georgios Paraskevopoulos, Srinivas Parthasarathy, Aparna Khare, Shiva Sundaram

We particularly focus on the scene context provided by the visual information, to ground the ASR.

Automatic Speech Recognition (ASR) +1

Multiresolution and Multimodal Speech Recognition with Transformers

no code implementations • 29 Apr 2020 • Georgios Paraskevopoulos, Srinivas Parthasarathy, Aparna Khare, Shiva Sundaram

We particularly focus on the scene context provided by the visual information, to ground the ASR.

Automatic Speech Recognition (ASR) +1

Robust Multi-channel Speech Recognition using Frequency Aligned Network

no code implementations • 6 Feb 2020 • Taejin Park, Kenichi Kumatani, Minhua Wu, Shiva Sundaram

In this paper, we further develop this idea and use a frequency aligned network for robust multi-channel automatic speech recognition (ASR).

Automatic Speech Recognition (ASR) +2

Fully Learnable Front-End for Multi-Channel Acoustic Modeling using Semi-Supervised Learning

no code implementations • 1 Feb 2020 • Sanna Wager, Aparna Khare, Minhua Wu, Kenichi Kumatani, Shiva Sundaram

Using a large offline teacher model trained on beamformed audio, we trained a simpler multi-channel student acoustic model used in the speech recognition system.

Automatic Speech Recognition (ASR) +1

Improving noise robustness of automatic speech recognition via parallel data and teacher-student learning

no code implementations • 5 Jan 2019 • Ladislav Mošner, Minhua Wu, Anirudh Raju, Sree Hari Krishnan Parthasarathi, Kenichi Kumatani, Shiva Sundaram, Roland Maas, Björn Hoffmeister

For real-world speech recognition applications, noise robustness is still a challenge.

Automatic Speech Recognition (ASR) +1
