Search Results for author: Ankur Narang

Found 12 papers, 1 paper with code

GeoFormer: A Vision and Sequence Transformer-based Approach for Greenhouse Gas Monitoring

no code implementations • 11 Feb 2024 • Madhav Khirwar, Ankur Narang

Air pollution represents a pivotal environmental challenge globally, playing a major role in climate change via greenhouse gas emissions and negatively affecting the health of billions.

Time Series

GeoViT: A Versatile Vision Transformer Architecture for Geospatial Image Analysis

no code implementations • 24 Nov 2023 • Madhav Khirwar, Ankur Narang

Greenhouse gases are pivotal drivers of climate change, necessitating precise quantification and source identification to foster mitigation strategies.

Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN

no code implementations • 27 Oct 2023 • Neeraj Kumar, Ankur Narang, Brejesh Lall

In this paper, we present a Diffusion GAN based approach (Prosodic Diff-TTS) that takes a style description and content text as input and generates high-fidelity speech samples within only 4 denoising steps.

Decoder, Denoising

KL Regularized Normalization Framework for Low Resource Tasks

no code implementations • 21 Dec 2022 • Neeraj Kumar, Ankur Narang, Brejesh Lall

Many normalization techniques have been proposed, but the success of normalization in low resource downstream NLP and speech tasks is limited.

One Shot Audio to Animated Video Generation

no code implementations • 19 Feb 2021 • Neeraj Kumar, Srishti Goel, Ankur Narang, Brejesh Lall, Mujtaba Hasan, Pranshu Agarwal, Dipankar Sarkar

We propose a novel method OneShotAu2AV to generate an animated video of arbitrary length using an audio clip and a single unseen image of a person as an input.

Video Generation

Robust One Shot Audio to Video Generation

no code implementations • 14 Dec 2020 • Neeraj Kumar, Srishti Goel, Ankur Narang, Mujtaba Hasan

High-quality video generation with expressive facial movements is a challenging problem that involves complex learning steps for generative adversarial networks.

Generative Adversarial Network, Marketing, +3

Multi Modal Adaptive Normalization for Audio to Video Generation

no code implementations • 14 Dec 2020 • Neeraj Kumar, Srishti Goel, Ankur Narang, Brejesh Lall

The multi-modal adaptive normalization uses various audio and video features, such as the Mel spectrogram, pitch, and energy from audio signals, a predicted keypoint heatmap/optical flow, and a single image, to learn the respective affine parameters and generate highly expressive video.

Optical Flow Estimation, SSIM, +1

Few Shot Adaptive Normalization Driven Multi-Speaker Speech Synthesis

no code implementations • 14 Dec 2020 • Neeraj Kumar, Srishti Goel, Ankur Narang, Brejesh Lall

High-quality, few-shot multi-speaker speech synthesis that accounts for prosody is an area of active research with many real-world applications.

Cultural Vocal Bursts Intensity Prediction, Speech Synthesis