Search Results for author: Arun Mallya

Found 24 papers, 10 papers with code

SPACE: Speech-driven Portrait Animation with Controllable Expression

no code implementations ICCV 2023 Siddharth Gururani, Arun Mallya, Ting-Chun Wang, Rafael Valle, Ming-Yu Liu

SPACE uses a multi-stage approach, combining the controllability of facial landmarks with the high-quality synthesis power of a pretrained face generator.

Implicit Warping for Animation with Image Sets

no code implementations 4 Oct 2022 Arun Mallya, Ting-Chun Wang, Ming-Yu Liu

We present a new implicit warping framework for image animation using sets of source images through the transfer of the motion of a driving video.

Image Animation

AdaViT: Adaptive Tokens for Efficient Vision Transformer

1 code implementation CVPR 2022 Hongxu Yin, Arash Vahdat, Jose Alvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov

A-ViT improves inference efficiency by automatically reducing the number of tokens processed in vision transformers as inference proceeds.

Efficient ViTs, Token Reduction
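The token-reduction idea above can be sketched as follows: each token accumulates a halting score and is dropped from later layers once that score crosses a threshold, so deeper layers process fewer tokens. This is a toy illustration only; the halting scores below are hard-coded stand-ins for the learned per-token scores in A-ViT.

```python
def run_layers(tokens, halting_per_layer, halt_at=1.0):
    """tokens: list of token ids. halting_per_layer[l][t]: score added
    to token t at layer l. Returns the number of tokens still active
    after each layer (halted tokens are skipped in deeper layers)."""
    cumulative = {t: 0.0 for t in tokens}
    active = list(tokens)
    counts = []
    for layer_scores in halting_per_layer:
        still_active = []
        for t in active:
            cumulative[t] += layer_scores[t]
            if cumulative[t] < halt_at:  # token keeps going
                still_active.append(t)
        active = still_active
        counts.append(len(active))
    return counts

# 3 tokens, 3 layers; token 2 (e.g. easy background) halts early.
halting = [
    {0: 0.1, 1: 0.2, 2: 0.6},
    {0: 0.1, 1: 0.2, 2: 0.6},  # token 2 reaches 1.2 and is dropped
    {0: 0.1, 1: 0.2, 2: 0.0},
]
counts = run_layers([0, 1, 2], halting)
print(counts)  # fewer tokens survive into deeper layers
```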

Multimodal Conditional Image Synthesis with Product-of-Experts GANs

no code implementations 9 Dec 2021 Xun Huang, Arun Mallya, Ting-Chun Wang, Ming-Yu Liu

Existing conditional image synthesis frameworks generate images based on user inputs in a single modality, such as text, segmentation, sketch, or style reference.

Image-to-Image Translation

GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds

no code implementations ICCV 2021 Zekun Hao, Arun Mallya, Serge Belongie, Ming-Yu Liu

We represent the world as a continuous volumetric function and train our model to render view-consistent photorealistic images for a user-controlled camera.

Neural Rendering

See through Gradients: Image Batch Recovery via GradInversion

2 code implementations CVPR 2021 Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

In this work, we introduce GradInversion, with which input images from a larger batch (8-48 images) can be recovered even for large networks such as ResNets (50 layers) on complex datasets such as ImageNet (1000 classes, 224x224 px).

Federated Learning, Inference Attack +1
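The gradient-matching idea behind this attack can be shown in miniature: given the gradient a secret input produced on a model, optimize a candidate input so that its gradient matches. Everything here (the one-weight model, the scalar input) is a didactic stand-in; the actual method operates on full ResNets and adds image priors such as batch-norm statistics.

```python
def model_grad(x, w=0.5, y=1.0):
    """Gradient d/dw of the loss (w*x - y)^2 for a one-weight model."""
    return 2.0 * (w * x - y) * x

secret_x = 3.0
g_true = model_grad(secret_x)  # the gradient an attacker observes

def match(x):
    """Gradient-matching objective: distance to the observed gradient."""
    return (model_grad(x) - g_true) ** 2

# Recover x by gradient descent on match(x), using a numerical slope.
x_hat, lr, eps = 2.0, 0.01, 1e-5
for _ in range(2000):
    slope = (match(x_hat + eps) - match(x_hat - eps)) / (2 * eps)
    x_hat -= lr * slope

print(round(x_hat, 3))  # converges to the secret input, 3.0
```

Note that gradient matching can have multiple consistent solutions (here, a second root exists at x = -1), which is one reason the full method needs additional priors.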

One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing

2 code implementations CVPR 2021 Ting-Chun Wang, Arun Mallya, Ming-Yu Liu

We propose a neural talking-head video synthesis model and demonstrate its application to video conferencing.

Generative Adversarial Networks for Image and Video Synthesis: Algorithms and Applications

no code implementations 6 Aug 2020 Ming-Yu Liu, Xun Huang, Jiahui Yu, Ting-Chun Wang, Arun Mallya

The generative adversarial network (GAN) framework has emerged as a powerful tool for various image and video synthesis tasks, allowing the synthesis of visual content in an unconditional or input-conditional manner.

Generative Adversarial Network, Neural Rendering +1

World-Consistent Video-to-Video Synthesis

no code implementations ECCV 2020 Arun Mallya, Ting-Chun Wang, Karan Sapra, Ming-Yu Liu

This is because existing methods lack knowledge of the 3D world being rendered and generate each frame based only on the past few frames.

Video-to-Video Synthesis

UNAS: Differentiable Architecture Search Meets Reinforcement Learning

1 code implementation CVPR 2020 Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz

Our framework brings the best of both worlds, and it enables us to search for architectures with both differentiable and non-differentiable criteria in one unified framework while maintaining a low search cost.

Neural Architecture Search, Reinforcement Learning +1

Importance Estimation for Neural Network Pruning

3 code implementations CVPR 2019 Pavlo Molchanov, Arun Mallya, Stephen Tyree, Iuri Frosio, Jan Kautz

On ResNet-101, we achieve a 40% FLOPs reduction by removing 30% of the parameters, with a loss of 0.02% in the top-1 accuracy on ImageNet.

Network Pruning
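The importance estimate this paper popularized can be sketched briefly: the change in loss from removing a weight w with gradient g is approximated by a first-order Taylor term, (g * w)^2, and a filter's importance sums these over its weights; the lowest-scoring filters are pruned first. The toy filters below are made up for illustration.

```python
def taylor_importance(weights, grads):
    """First-order Taylor importance: sum of (gradient * weight)^2."""
    return sum((g * w) ** 2 for g, w in zip(grads, weights))

def rank_filters(filters):
    """filters: list of (weights, grads) pairs. Returns filter indices
    sorted least-important first -- prune from the front of the list."""
    scores = [taylor_importance(w, g) for w, g in filters]
    return sorted(range(len(filters)), key=lambda i: scores[i])

filters = [
    ([0.5, -0.2], [0.1, 0.3]),   # small |g*w| products -> low importance
    ([1.0, 2.0], [0.5, -0.4]),   # large |g*w| products -> high importance
]
order = rank_filters(filters)
print(order)  # least important filter comes first
```

The appeal of this score is that both the weights and their gradients are already available during training, so ranking filters adds almost no cost.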

Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation

no code implementations 28 May 2019 Zih-Siou Hung, Arun Mallya, Svetlana Lazebnik

The previous VTransE model maps entities and predicates into a low-dimensional embedding vector space where the predicate is interpreted as a translation vector between the embedded features of the bounding box regions of the subject and the object.

Graph Generation, Object +4
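The translation-embedding idea described above can be sketched directly: in the learned space, subject + predicate should land near the object, so a relationship triple is scored by how small ||s + p - o|| is. The vectors below are made-up toy embeddings, not learned features.

```python
def score(subject, predicate, obj):
    """Negative squared distance ||s + p - o||^2; higher = more likely."""
    return -sum((s + p - o) ** 2
                for s, p, o in zip(subject, predicate, obj))

# Toy 2-D embeddings (illustrative values only).
person = [1.0, 0.0]
bike   = [1.5, 1.0]
ride   = [0.5, 1.0]    # translation vector for the "ride" predicate
wear   = [0.0, -1.0]   # translation vector for the "wear" predicate

# "person ride bike" should outscore the implausible "person wear bike".
assert score(person, ride, bike) > score(person, wear, bike)
```

Training pushes embeddings so that true triples satisfy s + p ≈ o, mirroring the TransE objective from knowledge-graph embedding.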

Few-Shot Unsupervised Image-to-Image Translation

10 code implementations ICCV 2019 Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, Jan Kautz

Unsupervised image-to-image translation methods learn to map images in a given class to an analogous image in a different class, drawing on unstructured (non-registered) datasets of images.

Translation, Unsupervised Image-to-Image Translation

Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights

1 code implementation ECCV 2018 Arun Mallya, Dillon Davis, Svetlana Lazebnik

This work presents a method for adapting a single, fixed deep neural network to multiple tasks without affecting performance on already learned tasks.

Continual Learning, Quantization
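The masking mechanism at the heart of Piggyback can be sketched in a few lines: the backbone weights stay fixed, and each task learns a real-valued mask whose thresholded, binary version gates which weights that task uses. The names and toy values below are illustrative, not the paper's code.

```python
THRESHOLD = 0.0  # real-valued mask entries above this gate to 1

def binarize(real_mask, threshold=THRESHOLD):
    """Hard-threshold the learned real-valued mask into a binary gate."""
    return [1.0 if m > threshold else 0.0 for m in real_mask]

def masked_weights(fixed_weights, real_mask):
    """Task-specific weights = fixed backbone weights gated by the mask.
    The backbone is never updated, so earlier tasks are unaffected."""
    gate = binarize(real_mask)
    return [w if g else 0.0 for w, g in zip(fixed_weights, gate)]

backbone  = [0.7, -1.2, 0.3, 2.0]   # shared, frozen
task_mask = [0.4, -0.1, 0.9, -0.6]  # learned per task
print(masked_weights(backbone, task_mask))  # [0.7, 0.0, 0.3, 0.0]
```

Since only a one-bit mask per weight is stored per task, adding a task costs a small fraction of the backbone's size, and no task can interfere with another.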

Recurrent Models for Situation Recognition

no code implementations ICCV 2017 Arun Mallya, Svetlana Lazebnik

This work proposes Recurrent Neural Network (RNN) models to predict structured 'image situations' -- actions and noun entities fulfilling semantic roles related to the action.

Grounded Situation Recognition, Human-Object Interaction Detection +1

Combining Multiple Cues for Visual Madlibs Question Answering

no code implementations 1 Nov 2016 Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

This paper presents an approach for answering fill-in-the-blank multiple choice questions from the Visual Madlibs dataset.

Attribute, General Classification +3

Solving Visual Madlibs with Multiple Cues

no code implementations 11 Aug 2016 Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset.

Activity Prediction, Attribute +4

Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering

no code implementations 16 Apr 2016 Arun Mallya, Svetlana Lazebnik

This paper proposes deep convolutional network models that utilize local and global context to make human activity label predictions in still images, achieving state-of-the-art performance on two recent datasets with hundreds of labels each.

General Classification, Human-Object Interaction Detection +4

Learning Informative Edge Maps for Indoor Scene Layout Prediction

no code implementations ICCV 2015 Arun Mallya, Svetlana Lazebnik

We learn to predict 'informative edge' probability maps using two recent methods that exploit local and global context, respectively: structured edge detection forests, and a fully convolutional network for pixelwise labeling.

Edge Detection

Part Localization using Multi-Proposal Consensus for Fine-Grained Categorization

no code implementations 22 Jul 2015 Kevin J. Shih, Arun Mallya, Saurabh Singh, Derek Hoiem

We present a simple deep learning framework to simultaneously predict keypoint locations and their respective visibilities and use those to achieve state-of-the-art performance for fine-grained classification.

General Classification

Unsupervised Network Pretraining via Encoding Human Design

no code implementations 19 Feb 2015 Ming-Yu Liu, Arun Mallya, Oncel C. Tuzel, Xi Chen

Our idea is to pretrain the network through the task of replicating the process of hand-designed feature extraction.

Object Recognition
