Search Results for author: Alexander G. Schwing

Found 68 papers, 29 papers with code

UFO²: A Unified Framework towards Omni-supervised Object Detection

1 code implementation ECCV 2020 Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

Object object-detection +1

Proposal-based Video Completion

no code implementations ECCV 2020 Yuan-Ting Hu, Heng Wang, Nicolas Ballas, Kristen Grauman, Alexander G. Schwing

Video inpainting is an important technique for a wide variety of applications from video content editing to video restoration.

Image Inpainting object-detection +4

OW-VISCap: Open-World Video Instance Segmentation and Captioning

no code implementations 4 Apr 2024 Anwesa Choudhuri, Girish Chowdhary, Alexander G. Schwing

To address these issues, we propose Open-World Video Instance Segmentation and Captioning (OW-VISCap), an approach to jointly segment, track, and caption previously seen or unseen objects in a video.

Descriptive Instance Segmentation +5

StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D

no code implementations 2 Dec 2023 Pengsheng Guo, Hans Hao, Adam Caccavale, Zhongzheng Ren, Edward Zhang, Qi Shan, Aditya Sankar, Alexander G. Schwing, Alex Colburn, Fangchang Ma

Our analysis identifies the core of these challenges as the interaction among noise levels in the 2D diffusion process, the architecture of the diffusion network, and the 3D model representation.

3D Generation Text to 3D +1

Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching

1 code implementation 2 Nov 2023 Kai Yan, Alexander G. Schwing, Yu-Xiong Wang

To address this problem, we propose Primal Wasserstein DICE (PW-DICE), which minimizes the primal Wasserstein distance between the expert and learner state occupancies with a pessimistic regularizer and leverages a contrastively learned distance as the underlying metric for the Wasserstein distance.

Pseudo-Generalized Dynamic View Synthesis from a Video

no code implementations 12 Oct 2023 Xiaoming Zhao, Alex Colburn, Fangchang Ma, Miguel Angel Bautista, Joshua M. Susskind, Alexander G. Schwing

In contrast, for dynamic scenes, scene-specific optimization techniques exist, but, to our best knowledge, there is currently no generalized method for dynamic novel view synthesis from a given monocular video.

Novel View Synthesis

Robust Model-Based Optimization for Challenging Fitness Landscapes

1 code implementation 23 May 2023 Saba Ghaffari, Ehsan Saleh, Alexander G. Schwing, Yu-Xiong Wang, Martin D. Burke, Saurabh Sinha

Protein design, a grand challenge of the day, involves optimization on a fitness landscape, and leading methods adopt a model-based approach where a model is trained on a training set (protein sequences and fitness) and proposes candidates to explore next.

Benchmarking Protein Design

Context-Aware Relative Object Queries To Unify Video Instance and Panoptic Segmentation

1 code implementation CVPR 2023 Anwesa Choudhuri, Girish Chowdhary, Alexander G. Schwing

We evaluate the proposed approach across three challenging tasks: video instance segmentation, multi-object tracking and segmentation, and video panoptic segmentation.

Instance Segmentation Multi-Object Tracking +8

Learnable Polyphase Sampling for Shift Invariant and Equivariant Convolutional Networks

1 code implementation 14 Oct 2022 Renan A. Rojas-Gomez, Teck-Yian Lim, Alexander G. Schwing, Minh N. Do, Raymond A. Yeh

We propose learnable polyphase sampling (LPS), a pair of learnable down/upsampling layers that enable truly shift-invariant and equivariant convolutional networks.

Image Classification Segmentation +1

Controllable Radiance Fields for Dynamic Face Synthesis

no code implementations 11 Oct 2022 Peiye Zhuang, Liqian Ma, Oluwasanmi Koyejo, Alexander G. Schwing

Recent work on 3D-aware image synthesis has achieved compelling results using advances in neural rendering.

3D-Aware Image Synthesis Face Generation +2

Learning to Decompose Visual Features with Latent Textual Prompts

no code implementations 9 Oct 2022 Feng Wang, Manling Li, Xudong Lin, Hairong Lv, Alexander G. Schwing, Heng Ji

Recent advances in pre-training vision-language models like CLIP have shown great potential in learning transferable visual representations.

Retrieval

Occupancy Planes for Single-view RGB-D Human Reconstruction

1 code implementation 4 Aug 2022 Xiaoming Zhao, Yuan-Ting Hu, Zhongzheng Ren, Alexander G. Schwing

Specifically, a set of 3D locations within the view-frustum of the camera are first projected independently onto the image and a corresponding feature is subsequently extracted for each 3D location.

3D Human Reconstruction

Initialization and Alignment for Adversarial Texture Optimization

no code implementations 28 Jul 2022 Xiaoming Zhao, Zhizhen Zhao, Alexander G. Schwing

While recovery of geometry from image and video data has received a lot of attention in computer vision, methods to capture the texture for a given geometry are less mature.

Texture Synthesis

XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

1 code implementation 14 Jul 2022 Ho Kei Cheng, Alexander G. Schwing

We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model.

Ranked #1 on Video Object Segmentation on YouTube-VOS 2019 (using extra training data)

2D Human Pose Estimation 2D Object Detection +5

Neural Volumetric Object Selection

no code implementations CVPR 2022 Zhongzheng Ren, Aseem Agarwala, Bryan Russell, Alexander G. Schwing, Oliver Wang

We introduce an approach for selecting objects in neural volumetric 3D representations, such as multi-plane images (MPI) and neural radiance fields (NeRF).

Object Segmentation

Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language

no code implementations 12 May 2022 Iou-Jen Liu, Xingdi Yuan, Marc-Alexandre Côté, Pierre-Yves Oudeyer, Alexander G. Schwing

In order to study how agents can be taught to query external knowledge via language, we first introduce two new environments: the grid-world-based Q-BabyAI and the text-based Q-TextWorld.

Equivariance Discovery by Learned Parameter-Sharing

1 code implementation 7 Apr 2022 Raymond A. Yeh, Yuan-Ting Hu, Mark Hasegawa-Johnson, Alexander G. Schwing

Designing equivariance as an inductive bias into deep-nets has been a prominent approach to build effective models, e.g., a convolutional neural network incorporates translation equivariance.

Inductive Bias Translation

Mask2Former for Video Instance Segmentation

5 code implementations 20 Dec 2021 Bowen Cheng, Anwesa Choudhuri, Ishan Misra, Alexander Kirillov, Rohit Girdhar, Alexander G. Schwing

We find Mask2Former also achieves state-of-the-art performance on video instance segmentation without modifying the architecture, the loss or even the training pipeline.

Image Segmentation Instance Segmentation +5

Semantic Tracklets: An Object-Centric Representation for Visual Multi-Agent Reinforcement Learning

no code implementations 6 Aug 2021 Iou-Jen Liu, Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing

We evaluate "semantic tracklets" on the visual multi-agent particle environment (VMPE) and on the challenging visual multi-agent GFootball environment.

Multi-agent Reinforcement Learning reinforcement-learning +1

Cooperative Exploration for Multi-Agent Deep Reinforcement Learning

no code implementations 23 Jul 2021 Iou-Jen Liu, Unnat Jain, Raymond A. Yeh, Alexander G. Schwing

To address this shortcoming, in this paper, we propose cooperative multi-agent exploration (CMAE): agents share a common goal while exploring.

reinforcement-learning Reinforcement Learning (RL) +2

Per-Pixel Classification is Not All You Need for Semantic Segmentation

3 code implementations NeurIPS 2021 Bowen Cheng, Alexander G. Schwing, Alexander Kirillov

Overall, the proposed mask classification-based method simplifies the landscape of effective approaches to semantic and panoptic segmentation tasks and shows excellent empirical results.

Classification Panoptic Segmentation +1

Robustifying $\ell_\infty$ Adversarial Training to the Union of Perturbation Models

1 code implementation NeurIPS 2021 Ameya D. Patil, Michael Tuttle, Alexander G. Schwing, Naresh R. Shanbhag

Classical adversarial training (AT) frameworks are designed to achieve high adversarial accuracy against a single attack type, typically $\ell_\infty$ norm-bounded perturbations.

3D Spatial Recognition without Spatially Labeled 3D

1 code implementation CVPR 2021 Zhongzheng Ren, Ishan Misra, Alexander G. Schwing, Rohit Girdhar

We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition, requiring only scene-level class tags as supervision.

3D Object Detection Multiple Instance Learning +5

DeepQAMVS: Query-Aware Hierarchical Pointer Networks for Multi-Video Summarization

no code implementations 13 May 2021 Safa Messaoud, Ismini Lourentzou, Assma Boughoula, Mona Zehni, Zhizhen Zhao, ChengXiang Zhai, Alexander G. Schwing

The recent growth of web video sharing platforms has increased the demand for systems that can efficiently browse, retrieve and summarize video content.

Video Summarization

Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation

2 code implementations ICLR 2021 Peiye Zhuang, Oluwasanmi Koyejo, Alexander G. Schwing

Controllable semantic image editing enables a user to change entire image attributes with a few clicks, e.g., gradually making a summer scene look like it was taken in winter.

Attribute Image Manipulation

Assignment-Space-Based Multi-Object Tracking and Segmentation

1 code implementation ICCV 2021 Anwesa Choudhuri, Girish Chowdhary, Alexander G. Schwing

In contrast, we formulate a global method for MOTS over the space of assignments rather than detections: First, we find all top-k assignments of objects detected and segmented between any two consecutive frames and develop a structured prediction formulation to score assignment sequences across any number of consecutive frames.

Multi-Object Tracking Multi-Object Tracking and Segmentation +4

High-Throughput Synchronous Deep RL

1 code implementation NeurIPS 2020 Iou-Jen Liu, Raymond A. Yeh, Alexander G. Schwing

In contrast, asynchronous methods achieve high throughput but suffer from stability issues and lower sample efficiency due to "stale policies."

Atari Games reinforcement-learning +2

UFO$^2$: A Unified Framework towards Omni-supervised Object Detection

no code implementations 21 Oct 2020 Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

object-detection Object Detection

Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning

no code implementations NeurIPS 2020 Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing

Existing semi-supervised learning (SSL) algorithms use a single weight to balance the loss of labeled and unlabeled examples, i.e., all unlabeled examples are equally weighted.

Can We Learn Heuristics For Graphical Model Inference Using Reinforcement Learning?

no code implementations CVPR 2020 Safa Messaoud, Maghav Kumar, Alexander G. Schwing

In this paper, we show that we can learn program heuristics, i.e., policies, for solving inference in higher order CRFs for the task of semantic segmentation, using reinforcement learning.

Action Recognition Combinatorial Optimization +5

PIC: Permutation Invariant Critic for Multi-Agent Deep Reinforcement Learning

2 code implementations 31 Oct 2019 Iou-Jen Liu, Raymond A. Yeh, Alexander G. Schwing

Sample efficiency and scalability to a large number of agents are two important goals for multi-agent reinforcement learning systems.

Multi-agent Reinforcement Learning reinforcement-learning +1

Co-Generation with GANs using AIS based HMC

1 code implementation NeurIPS 2019 Tiantian Fang, Alexander G. Schwing

Inferring the most likely configuration for a subset of variables of a joint distribution given the remaining ones - which we refer to as co-generation - is an important challenge that is computationally demanding for all but the simplest settings.

Structured Prediction

Chirality Nets for Human Pose Regression

1 code implementation NeurIPS 2019 Raymond A. Yeh, Yuan-Ting Hu, Alexander G. Schwing

We propose Chirality Nets, a family of deep nets that is equivariant to the "chirality transform," i.e., the transformation to create a chiral pair.

3D Human Pose Estimation 3D Pose Estimation +3

TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines

1 code implementation NeurIPS 2019 Jingxiang Lin, Unnat Jain, Alexander G. Schwing

Despite impressive recent progress that has been reported on tasks that necessitate reasoning, such as visual question answering and visual dialog, models often exploit biases in datasets.

Attribute Question Answering +3

fMRI data augmentation via synthesis

no code implementations 13 Jul 2019 Peiye Zhuang, Alexander G. Schwing, Sanmi Koyejo

Thus, our results suggest that data augmentation via synthesis is a promising approach to address the limited availability of fMRI data, and to improve the quality of predictive fMRI models.

Data Augmentation Generative Adversarial Network

Knowledge Flow: Improve Upon Your Teachers

no code implementations ICLR 2019 Iou-Jen Liu, Jian Peng, Alexander G. Schwing

A zoo of deep nets is available these days for almost any given task, and it is increasingly unclear which net to start with when addressing a new task, or which net to use as an initialization for fine-tuning a new model.

Reinforcement Learning (RL)

Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering

no code implementations NeurIPS 2018 Medhini Narasimhan, Svetlana Lazebnik, Alexander G. Schwing

Given a question-image pair, deep network techniques have been employed to successively reduce the large set of facts until one of the two entities of the final remaining fact is predicted as the answer.

Factual Visual Question Answering General Knowledge +2

Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering

no code implementations ECCV 2018 Medhini Narasimhan, Alexander G. Schwing

Question answering is an important task for autonomous agents and virtual assistants alike and was shown to support the disabled in efficiently navigating an overwhelming environment.

Factual Visual Question Answering General Knowledge +4

VideoMatch: Matching based Video Object Segmentation

no code implementations ECCV 2018 Yuan-Ting Hu, Jia-Bin Huang, Alexander G. Schwing

Due to the formulation as a prediction task, most of these methods require fine-tuning during test time, such that the deep nets memorize the appearance of the objects of interest in the given video.

Memorization Object +4

Diverse and Coherent Paragraph Generation from Images

no code implementations ECCV 2018 Moitreya Chatterjee, Alexander G. Schwing

Paragraph generation from images, which has gained popularity recently, is an important task for video summarization, editing, and support of the disabled.

Image Captioning Image Paragraph Captioning +1

Unsupervised Textual Grounding: Linking Words to Image Concepts

no code implementations CVPR 2018 Raymond A. Yeh, Minh N. Do, Alexander G. Schwing

Textual grounding, i.e., linking words to objects in images, is a challenging but important task for robotics and human-computer interaction.

Two-sample testing

Hallucinating brains with artificial brains

no code implementations ICLR 2018 Peiye Zhuang, Alexander G. Schwing, Oluwasanmi Koyejo

Our classification results provide a quantitative evaluation of the quality of the generated images, and also serve as an additional contribution of this manuscript.

Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening

1 code implementation 5 Nov 2016 Frank S. He, Yang Liu, Alexander G. Schwing, Jian Peng

We propose a novel training algorithm for reinforcement learning which combines the strength of deep Q-learning with a constrained optimization approach to tighten optimality and encourage faster reward propagation.

Atari Games Q-Learning +2

Semantic Image Inpainting with Deep Generative Models

7 code implementations CVPR 2017 Raymond A. Yeh, Chen Chen, Teck Yian Lim, Alexander G. Schwing, Mark Hasegawa-Johnson, Minh N. Do

In this paper, we propose a novel method for semantic image inpainting, which generates the missing content by conditioning on the available data.

Image Inpainting

Rent3D: Floor-Plan Priors for Monocular Layout Estimation

no code implementations CVPR 2015 Chenxi Liu, Alexander G. Schwing, Kaustav Kundu, Raquel Urtasun, Sanja Fidler

What sets us apart from past work in layout estimation is the use of floor plans as a source of prior knowledge, as well as localization of each image within a bigger space (apartment).

Learning to Segment Under Various Forms of Weak Supervision

no code implementations CVPR 2015 Jia Xu, Alexander G. Schwing, Raquel Urtasun

Despite the promising performance of conventional fully supervised algorithms, semantic segmentation has remained an important, yet challenging task.

Segmentation Semantic Segmentation

Fully Connected Deep Structured Networks

no code implementations 9 Mar 2015 Alexander G. Schwing, Raquel Urtasun

Convolutional neural networks with many layers have recently been shown to achieve excellent results on many high-level tasks such as image classification, object detection and more recently also semantic segmentation.

General Classification Image Classification +6

Learning Deep Structured Models

no code implementations 9 Jul 2014 Liang-Chieh Chen, Alexander G. Schwing, Alan L. Yuille, Raquel Urtasun

Towards this goal, we propose a training algorithm that is able to learn structured models jointly with deep features that form the MRF potentials.

Multi-class Classification

Tell Me What You See and I will Show You Where It Is

no code implementations CVPR 2014 Jia Xu, Alexander G. Schwing, Raquel Urtasun

We tackle the problem of weakly labeled semantic segmentation, where the only source of annotation is image tags encoding which classes are present in the scene.

Semantic Segmentation Structured Prediction +1

Efficient Structured Parsing of Facades Using Dynamic Programming

no code implementations CVPR 2014 Andrea Cohen, Alexander G. Schwing, Marc Pollefeys

We propose a sequential optimization technique for segmenting a rectified image of a facade into semantic categories.

General Classification
