Search Results for author: Alexander Schwing

Found 45 papers, 18 papers with code

Virtual Pets: Animatable Animal Generation in 3D Scenes

no code implementations • 21 Dec 2023 • Yen-Chi Cheng, Chieh Hubert Lin, Chaoyang Wang, Yash Kant, Sergey Tulyakov, Alexander Schwing, LiangYan Gui, Hsin-Ying Lee

Toward unlocking the potential of generative models in immersive 4D experiences, we introduce Virtual Pet, a novel pipeline to model realistic and diverse motions for target animal species within a 3D environment.

Putting the Object Back into Video Object Segmentation

1 code implementation • 19 Oct 2023 • Ho Kei Cheng, Seoung Wug Oh, Brian Price, Joon-Young Lee, Alexander Schwing

We present Cutie, a video object segmentation (VOS) network with object-level memory reading, which puts the object representation from memory back into the video object segmentation result.

Ranked #1 on Semi-Supervised Video Object Segmentation on MOSE

Object Segmentation +3

455

Tracking Anything with Decoupled Video Segmentation

1 code implementation • ICCV 2023 • Ho Kei Cheng, Seoung Wug Oh, Brian Price, Alexander Schwing, Joon-Young Lee

To 'track anything' without training on video data for every individual task, we develop a decoupled video segmentation approach (DEVA), composed of task-specific image-level segmentation and class/task-agnostic bi-directional temporal propagation.

Ranked #1 on Unsupervised Video Object Segmentation on DAVIS 2016 val (using extra training data)

Open-Vocabulary Video Segmentation Open-World Video Segmentation +7

1,050

SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation

1 code implementation • CVPR 2023 • Yen-Chi Cheng, Hsin-Ying Lee, Sergey Tulyakov, Alexander Schwing, LiangYan Gui

To enable interactive generation, our method supports a variety of input modalities that can be easily provided by a human, including images, text, partially observed shapes, and combinations of these, and further allows the strength of each input to be adjusted.

3D Reconstruction 3D Shape Generation +2

362

RGB-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control

no code implementations • 21 Oct 2022 • Zhenggang Tang, Balakumar Sundaralingam, Jonathan Tremblay, Bowen Wen, Ye Yuan, Stephen Tyree, Charles Loop, Alexander Schwing, Stan Birchfield

We present a system for collision-free control of a robot manipulator that uses only RGB views of the world.

Model Predictive Control

On the Importance of Gradient Norm in PAC-Bayesian Bounds

no code implementations • 12 Oct 2022 • Itai Gat, Yossi Adi, Alexander Schwing, Tamir Hazan

Generalization bounds, which assess the difference between the true risk and the empirical risk, have been studied extensively.

Generalization Bounds

Joint Forecasting of Panoptic Segmentations with Difference Attention

1 code implementation • CVPR 2022 • Colin Graber, Cyril Jazra, Wenjie Luo, LiangYan Gui, Alexander Schwing

For this, panoptic segmentations have been studied as a compelling representation in recent work.

Object Panoptic Segmentation +1

MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding

2 code implementations • 20 Dec 2021 • Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander Schwing, Heng Ji

Specifically, the task involves multi-hop questions that require reasoning over image-caption pairs to identify the grounded visual object being referred to and then predicting a span from the news body text to answer the question.

Answer Generation Data Augmentation +2

697

Perceptual Score: What Data Modalities Does Your Model Perceive?

1 code implementation • NeurIPS 2021 • Itai Gat, Idan Schwartz, Alexander Schwing

To study and quantify this concern, we introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features, i.e., modalities.

Question Answering Visual Dialog +1

CoVA: Context-aware Visual Attention for Webpage Information Extraction

1 code implementation • ECNLP (ACL) 2022 • Anurendra Kumar, Keval Morabia, Jingjin Wang, Kevin Chen-Chuan Chang, Alexander Schwing

To address this challenge we propose to reformulate WIE as a context-aware Webpage Object Detection task.

Ranked #1 on Webpage Object Detection on CoVA (using extra training data)

object-detection Object Detection +1

Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents

no code implementations • ICCV 2021 • Shivansh Patel, Saim Wani, Unnat Jain, Alexander Schwing, Svetlana Lazebnik, Manolis Savva, Angel X. Chang

We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.

The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation

no code implementations • ICCV 2021 • Xiaoming Zhao, Harsh Agrawal, Dhruv Batra, Alexander Schwing

It is fundamental for personal robots to reliably navigate to a specified goal.

Navigate PointGoal Navigation +1

Ordered Attention for Coherent Visual Storytelling

no code implementations • 4 Aug 2021 • Tom Braude, Idan Schwartz, Alexander Schwing, Ariel Shamir

OIA models interactions between the sentence-corresponding image and important regions in other images of the sequence.

Sentence Visual Storytelling

GridToPix: Training Embodied Agents with Minimal Supervision

no code implementations • ICCV 2021 • Unnat Jain, Iou-Jen Liu, Svetlana Lazebnik, Aniruddha Kembhavi, Luca Weihs, Alexander Schwing

While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped rewards.

PointGoal Navigation Reinforcement Learning (RL) +1

Panoptic Segmentation Forecasting

no code implementations • CVPR 2021 • Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander Schwing

Following this decomposition, we introduce panoptic segmentation forecasting.

Panoptic Segmentation Segmentation

Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies

1 code implementation • NeurIPS 2020 • Itai Gat, Idan Schwartz, Alexander Schwing, Tamir Hazan

However, regularization with the functional entropy is challenging.

Ranked #3 on Visual Question Answering (VQA) on VQA-CP

Question Answering Visual Question Answering

A Contrastive Learning Approach for Training Variational Autoencoder Priors

no code implementations • NeurIPS 2021 • Jyoti Aneja, Alexander Schwing, Jan Kautz, Arash Vahdat

To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.

Ranked #6 on Image Generation on CelebA 256x256 (FID metric)

Contrastive Learning Image Generation

Bridging the Imitation Gap by Adaptive Insubordination

no code implementations • NeurIPS 2021 • Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an "imitation gap" and, potentially, poor results.

Imitation Learning Memorization +2

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

no code implementations • ECCV 2020 • Unnat Jain, Luca Weihs, Eric Kolve, Ali Farhadi, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

Autonomous agents must learn to collaborate.

The 1st Agriculture-Vision Challenge: Methods and Results

1 code implementation • 21 Apr 2020 • Mang Tik Chiu, Xingqian Xu, Kai Wang, Jennifer Hobbs, Naira Hovakimyan, Thomas S. Huang, Honghui Shi, Yunchao Wei, Zilong Huang, Alexander Schwing, Robert Brunner, Ivan Dozier, Wyatt Dozier, Karen Ghandilyan, David Wilson, Hyunseong Park, Junhee Kim, Sungho Kim, Qinghui Liu, Michael C. Kampffmeyer, Robert Jenssen, Arnt B. Salberg, Alexandre Barbosa, Rodrigo Trevisan, Bingchen Zhao, Shaozuo Yu, Siwei Yang, Yin Wang, Hao Sheng, Xiao Chen, Jingyi Su, Ram Rajagopal, Andrew Ng, Van Thong Huynh, Soo-Hyung Kim, In-Seop Na, Ujjwal Baid, Shubham Innani, Prasad Dutande, Bhakti Baheti, Sanjay Talbar, Jianyu Tang

The first Agriculture-Vision Challenge aims to encourage research in developing novel and effective algorithms for agricultural pattern recognition from aerial images, especially for the semantic segmentation task associated with our challenge dataset.

Segmentation Semantic Segmentation

Disentangling Controllable Object through Video Prediction Improves Visual Reinforcement Learning

no code implementations • 21 Feb 2020 • Yuanyi Zhong, Alexander Schwing, Jian Peng

In many vision-based reinforcement learning (RL) problems, the agent controls a movable object in its visual field, e.g., the player's avatar in video games and the robotic arm in visual grasping and manipulation.

Atari Games Object +3

Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis

2 code implementations • CVPR 2020 • Mang Tik Chiu, Xingqian Xu, Yunchao Wei, Zilong Huang, Alexander Schwing, Robert Brunner, Hrant Khachatrian, Hovnatan Karapetyan, Ivan Dozier, Greg Rose, David Wilson, Adrian Tudor, Naira Hovakimyan, Thomas S. Huang, Honghui Shi

To encourage research in computer vision for agriculture, we present Agriculture-Vision: a large-scale aerial farmland image dataset for semantic segmentation of agricultural patterns.

Segmentation Semantic Segmentation

Calorimetry with Deep Learning: Particle Simulation and Reconstruction for Collider Physics

no code implementations • 14 Dec 2019 • Dawit Belayneh, Federico Carminati, Amir Farbin, Benjamin Hooberman, Gulrukh Khattak, Miaoyuan Liu, Junze Liu, Dominick Olivito, Vitória Barin Pacela, Maurizio Pierini, Alexander Schwing, Maria Spiropulu, Sofia Vallecorsa, Jean-Roch Vlimant, Wei Wei, Matt Zhang

These networks can serve as fast and computationally light methods for particle shower simulation and reconstruction for current and future experiments at particle colliders.

TAB-VCR: Tags and Attributes based VCR Baselines

1 code implementation • NeurIPS 2019 • Jingxiang Lin, Unnat Jain, Alexander Schwing

Despite impressive recent progress that has been reported on tasks that necessitate reasoning, such as visual question answering and visual dialog, models often exploit biases in datasets.

Attribute Question Answering +3

Graph Structured Prediction Energy Networks

1 code implementation • NeurIPS 2019 • Colin Graber, Alexander Schwing

For joint inference over multiple variables, a variety of structured prediction techniques have been developed to model correlations among variables and thereby improve predictions.

Structured Prediction

Towards Principled Objectives for Contrastive Disentanglement

no code implementations • 25 Sep 2019 • Anwesa Choudhuri, Ashok Vardhan Makkuva, Ranvir Rana, Sewoong Oh, Girish Chowdhary, Alexander Schwing

In fact, contrastive disentanglement and unsupervised recovery are often combined in that we seek additional variations that exhibit salient factors/properties.

Disentanglement

Unsupervised Discovery of Dynamic Neural Circuits

no code implementations • NeurIPS Workshop Neuro_AI 2019 • Colin Graber, Ryan Loh, Yurii Vlasov, Alexander Schwing

What can we learn about the functional organization of cortical microcircuits from large-scale recordings of neural activity?

ViCo: Word Embeddings from Visual Co-occurrences

1 code implementation • ICCV 2019 • Tanmay Gupta, Alexander Schwing, Derek Hoiem

Through unsupervised clustering, supervised partitioning, and a zero-shot-like generalization analysis we show that our word embeddings complement text-only embeddings like GloVe by better representing similarities and differences between visual concepts that are difficult to obtain from text corpora alone.

Attribute Clustering +1

Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning

no code implementations • ICCV 2019 • Jyoti Aneja, Harsh Agrawal, Dhruv Batra, Alexander Schwing

We encourage this temporal latent space to capture the 'intention' about how to complete the sentence by mimicking a representation which summarizes the future.

Image Captioning Language Modelling +1

Factor Graph Attention

1 code implementation • CVPR 2019 • Idan Schwartz, Seunghak Yu, Tamir Hazan, Alexander Schwing

We address this issue and develop a general attention mechanism for visual dialog which operates on any number of data utilities.

Ranked #1 on Visual Dialog on VisDial v0.9 val

Graph Attention Question Answering +2

Max-Sliced Wasserstein Distance and its use for GANs

no code implementations • CVPR 2019 • Ishan Deshpande, Yuan-Ting Hu, Ruoyu Sun, Ayis Pyrros, Nasir Siddiqui, Sanmi Koyejo, Zhizhen Zhao, David Forsyth, Alexander Schwing

Generative adversarial nets (GANs) and variational auto-encoders have significantly improved our distribution modeling capabilities, showing promise for dataset augmentation, image-to-image translation and feature learning.

Image-to-Image Translation Translation

Two Body Problem: Collaborative Visual Task Completion

no code implementations • CVPR 2019 • Unnat Jain, Luca Weihs, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi

Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities.

Task 2 Vocal Bursts Valence Prediction

No-Frills Human-Object Interaction Detection: Factorization, Layout Encodings, and Training Techniques

3 code implementations • ICCV 2019 • Tanmay Gupta, Alexander Schwing, Derek Hoiem

We show that for human-object interaction detection a relatively simple factorized model with appearance and layout encodings constructed from pre-trained object detectors outperforms more sophisticated approaches.

Human-Object Interaction Detection Object

GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training

no code implementations • NeurIPS 2018 • Mingchao Yu, Zhifeng Lin, Krishna Narra, Songze Li, Youjie Li, Nam Sung Kim, Alexander Schwing, Murali Annavaram, Salman Avestimehr

Data parallelism can boost the training speed of convolutional neural networks (CNN), but could suffer from significant communication costs caused by gradient aggregation.

Dimensionality Reduction Quantization

Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training

no code implementations • NeurIPS 2018 • Youjie Li, Mingchao Yu, Songze Li, Salman Avestimehr, Nam Sung Kim, Alexander Schwing

Distributed training of deep nets is an important technique to address some of the present day computing challenges like memory consumption and computational demands.

Deep Structured Prediction with Nonlinear Output Transformations

1 code implementation • NeurIPS 2018 • Colin Graber, Ofer Meshi, Alexander Schwing

Deep structured models are widely used for tasks like semantic segmentation, where explicit correlations between variables provide important prior information which generally helps to reduce the data needs of deep nets.

Semantic Segmentation Structured Prediction

Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech

no code implementations • CVPR 2019 • Aditya Deshpande, Jyoti Aneja, Li-Wei Wang, Alexander Schwing, D. A. Forsyth

We achieve the trifecta: (1) High accuracy for the diverse captions as evaluated by standard captioning metrics and user studies; (2) Faster computation of diverse captions compared to beam search and diverse beam search; and (3) High diversity as evaluated by counting novel sentences, distinct n-grams and mutual overlap (i.e., mBleu-4) scores.

Caption Generation Image Captioning

Generative Modeling using the Sliced Wasserstein Distance

1 code implementation • CVPR 2018 • Ishan Deshpande, Ziyu Zhang, Alexander Schwing

While this is particularly true for early GAN formulations, there has been significant empirically motivated and theoretically founded progress to improve stability, for instance, by using the Wasserstein distance rather than the Jensen-Shannon divergence.

Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering

no code implementations • CVPR 2018 • Unnat Jain, Svetlana Lazebnik, Alexander Schwing

In addition, for the first time on the visual dialog dataset, we assess the performance of a system asking questions, and demonstrate how visual dialog can be generated from discriminative question generation and question answering.

Ranked #7 on Visual Dialog on VisDial v0.9 val

Image Captioning Question Answering +4

Asynchronous Parallel Coordinate Minimization for MAP Inference

no code implementations • NeurIPS 2017 • Ofer Meshi, Alexander Schwing

Finding the maximum a-posteriori (MAP) assignment is a central task in graphical models.

Convolutional Image Captioning

4 code implementations • CVPR 2018 • Jyoti Aneja, Aditya Deshpande, Alexander Schwing

In recent years significant progress has been made in image captioning, using Recurrent Neural Networks powered by long-short-term-memory (LSTM) units.

Image Captioning Text Generation +1

130

Dualing GANs

no code implementations • NeurIPS 2017 • Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel

We start from linear discriminators in which case conjugate duality provides a mechanism to reformulate the saddle point objective into a maximization problem, such that both the generator and the discriminator of this 'dualing GAN' act in concert.

Creativity: Generating Diverse Questions using Variational Autoencoders

no code implementations • CVPR 2017 • Unnat Jain, Ziyu Zhang, Alexander Schwing

Generating diverse questions for given images is an important task for computational education, entertainment and AI assistants.

Question Generation Question-Generation

Statistical Inference, Learning and Models in Big Data

no code implementations • 9 Sep 2015 • Beate Franke, Jean-François Plante, Ribana Roscher, Annie Lee, Cathal Smyth, Armin Hatefi, Fuqi Chen, Einat Gil, Alexander Schwing, Alessandro Selvitella, Michael M. Hoffman, Roger Grosse, Dieter Hendricks, Nancy Reid

The need for new methods to deal with big data is a common theme in most scientific fields, although its definition tends to vary with the context.

Blending Learning and Inference in Structured Prediction

no code implementations • 8 Oct 2012 • Tamir Hazan, Alexander Schwing, David McAllester, Raquel Urtasun

In this paper we derive an efficient algorithm to learn the parameters of structured predictors in general graphical models.

Scene Understanding Semantic Segmentation +1
