Search Results for author: Jose M. Alvarez

Found 70 papers, 35 papers with code

Improving Distant 3D Object Detection Using 2D Box Supervision

no code implementations 14 Mar 2024 Zetong Yang, Zhiding Yu, Chris Choy, Renhao Wang, Anima Anandkumar, Jose M. Alvarez

This mapping allows the depth estimation of distant objects conditioned on their 2D boxes, making long-range 3D detection with 2D supervision feasible.

3D Object Detection Depth Estimation +2

Causal Perception

no code implementations 24 Jan 2024 Jose M. Alvarez, Salvatore Ruggieri

In this work, we formalize perception under causal reasoning to capture the act of interpretation by an individual.

Decision Making Fairness

Fully Attentional Networks with Self-emerging Token Labeling

1 code implementation ICCV 2023 Bingyin Zhao, Zhiding Yu, Shiyi Lan, Yutao Cheng, Anima Anandkumar, Yingjie Lao, Jose M. Alvarez

With the proposed STL framework, our best model based on FAN-L-Hybrid (77.3M parameters) achieves 84.8% Top-1 accuracy and 42.1% mCE on ImageNet-1K and ImageNet-C, and sets a new state-of-the-art for ImageNet-A (46.1%) and ImageNet-R (56.6%) without using extra data, outperforming the original FAN counterpart by significant margins.

Semantic Segmentation

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?

1 code implementation 5 Dec 2023 Zhiqi Li, Zhiding Yu, Shiyi Lan, Jiahan Li, Jan Kautz, Tong Lu, Jose M. Alvarez

We initially observed that the nuScenes dataset, characterized by relatively simple driving scenarios, leads to an under-utilization of perception information in end-to-end models incorporating ego status, such as the ego vehicle's velocity.

Autonomous Driving

BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection

no code implementations 4 Dec 2023 Zhenxin Li, Shiyi Lan, Jose M. Alvarez, Zuxuan Wu

Recently, the rise of query-based Transformer decoders is reshaping camera-based 3D object detection.

3D Object Detection Depth Estimation +3

SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation

1 code implementation 24 Nov 2023 Lingchen Meng, Shiyi Lan, Hengduo Li, Jose M. Alvarez, Zuxuan Wu, Yu-Gang Jiang

In-context segmentation aims at segmenting novel images using a few labeled example images, termed as "in-context examples", exploring content similarities between examples and the target.

Meta-Learning One-Shot Segmentation +3

ViR: Towards Efficient Vision Retention Backbones

1 code implementation 30 Oct 2023 Ali Hatamizadeh, Michael Ranzinger, Shiyi Lan, Jose M. Alvarez, Sanja Fidler, Jan Kautz

Inspired by this trend, we propose a new class of computer vision models, dubbed Vision Retention Networks (ViR), with dual parallel and recurrent formulations, which strike an optimal balance between fast inference and parallel training with competitive performance.

Towards Viewpoint Robustness in Bird's Eye View Segmentation

no code implementations ICCV 2023 Tzofi Klinghoffer, Jonah Philion, Wenzheng Chen, Or Litany, Zan Gojcic, Jungseock Joo, Ramesh Raskar, Sanja Fidler, Jose M. Alvarez

We introduce a technique for novel view synthesis and use it to transform collected data to the viewpoint of target rigs, allowing us to train BEV segmentation models for diverse target rigs without any additional data collection or labeling cost.

Autonomous Vehicles Novel View Synthesis

The Initial Screening Order Problem

no code implementations 28 Jul 2023 Jose M. Alvarez, Antonio Mastropietro, Salvatore Ruggieri

To study the impact of ISO, we introduce a human-like screener and compare it to its algorithmic counterpart.

Decision Making Fairness +1

FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation

1 code implementation 4 Jul 2023 Zhiqi Li, Zhiding Yu, David Austin, Mingsheng Fang, Shiyi Lan, Jan Kautz, Jose M. Alvarez

This technical report summarizes the winning solution for the 3D Occupancy Prediction Challenge, held in conjunction with the CVPR 2023 Workshop on End-to-End Autonomous Driving and the CVPR 2023 Workshop on Vision-Centric Autonomous Driving.

Autonomous Driving Prediction Of Occupancy Grid Maps

Domain Adaptive Decision Trees: Implications for Accuracy and Fairness

1 code implementation 27 Feb 2023 Jose M. Alvarez, Kristen M. Scott, Salvatore Ruggieri, Bettina Berendt

In uses of pre-trained machine learning models, it is a known issue that the target population in which the model is being deployed may not have been reflected in the source population with which the model was trained.

Domain Adaptation Fairness

Counterfactual Situation Testing: Uncovering Discrimination under Fairness given the Difference

1 code implementation 23 Feb 2023 Jose M. Alvarez, Salvatore Ruggieri

For any complainant, we find and compare similar protected and non-protected instances in the dataset used by the classifier to construct a control and test group, where a difference between the decision outcomes of the two groups implies potential individual discrimination.

Attribute counterfactual +2

Vision Transformers Are Good Mask Auto-Labelers

no code implementations CVPR 2023 Shiyi Lan, Xitong Yang, Zhiding Yu, Zuxuan Wu, Jose M. Alvarez, Anima Anandkumar

We propose Mask Auto-Labeler (MAL), a high-quality Transformer-based mask auto-labeling framework for instance segmentation using only box annotations.

Instance Segmentation Segmentation +1

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

1 code implementation ICCV 2023 Yilun Chen, Zhiding Yu, Yukang Chen, Shiyi Lan, Anima Anandkumar, Jiaya Jia, Jose M. Alvarez

For 3D object detection, we instantiate this method as FocalFormer3D, a simple yet effective detector that excels at excavating difficult objects and improving prediction recall.

3D Object Detection Autonomous Driving +2

Soft Masking for Cost-Constrained Channel Pruning

1 code implementation 4 Nov 2022 Ryan Humble, Maying Shen, Jorge Albericio Latorre, Eric Darve, Jose M. Alvarez

Structured channel pruning has been shown to significantly accelerate inference time for convolutional neural networks (CNNs) on modern hardware, with a relatively minor loss of network accuracy.

Structural Pruning via Latency-Saliency Knapsack

1 code implementation 13 Oct 2022 Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez

We propose Hardware-Aware Latency Pruning (HALP) that formulates structural pruning as a global resource allocation optimization problem, aiming at maximizing the accuracy while constraining latency under a predefined budget on the target device.

Optimizing Data Collection for Machine Learning

no code implementations 3 Oct 2022 Rafid Mahmood, James Lucas, Jose M. Alvarez, Sanja Fidler, Marc T. Law

Modern deep learning systems require huge data sets to achieve impressive performance, but there is little guidance on how much or what kind of data to collect.

Object-Level Targeted Selection via Deep Template Matching

no code implementations 5 Jul 2022 Suraj Kothawade, Donna Roy, Michele Fenzi, Elmar Haussmann, Jose M. Alvarez, Christoph Angerer

Existing semantic image retrieval methods often focus on mining for larger sized geographical landmarks, and/or require extra labeled data, such as images/image-pairs with similar objects, for mining images with generic objects.

Autonomous Driving Image Retrieval +3

Non-parametric Depth Distribution Modelling based Depth Inference for Multi-view Stereo

1 code implementation CVPR 2022 Jiayu Yang, Jose M. Alvarez, Miaomiao Liu

Boundary pixels usually follow a multi-modal distribution as they represent different depths; therefore, the assumption results in an erroneous depth prediction at the coarser level of the cost volume pyramid that cannot be corrected in the refinement levels, leading to wrong depth predictions.

Depth Estimation Depth Prediction

Understanding The Robustness in Vision Transformers

2 code implementations 26 Apr 2022 Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng, Jose M. Alvarez

Our study is motivated by the intriguing properties of the emerging visual grouping in Vision Transformers, which indicates that self-attention may promote robustness through improved mid-level representations.

Ranked #4 on Domain Generalization on ImageNet-R (using extra training data)

Domain Generalization Image Classification +3

M$^2$BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation

no code implementations 11 Apr 2022 Enze Xie, Zhiding Yu, Daquan Zhou, Jonah Philion, Anima Anandkumar, Sanja Fidler, Ping Luo, Jose M. Alvarez

In this paper, we propose M$^2$BEV, a unified framework that jointly performs 3D object detection and map segmentation in the Bird's-Eye View (BEV) space with multi-camera image inputs.

3D Object Detection object-detection +1

FreeSOLO: Learning to Segment Objects without Annotations

1 code implementation CVPR 2022 Xinlong Wang, Zhiding Yu, Shalini De Mello, Jan Kautz, Anima Anandkumar, Chunhua Shen, Jose M. Alvarez

FreeSOLO further demonstrates superiority as a strong pre-training method, outperforming state-of-the-art self-supervised pre-training methods by +9.8% AP when fine-tuning instance segmentation with only 5% COCO masks.

Instance Segmentation object-detection +4

Fairness Implications of Encoding Protected Categorical Attributes

2 code implementations 27 Jan 2022 Carlos Mougan, Jose M. Alvarez, Salvatore Ruggieri, Steffen Staab

We investigate the interaction between categorical encodings and target encoding regularization methods that reduce unfairness.

Fairness Feature Engineering

Boosting Supervised Learning Performance with Co-training

no code implementations 18 Nov 2021 Xinnan Du, William Zhang, Jose M. Alvarez

In this paper, we propose a new light-weight self-supervised learning framework that could boost supervised learning performance with minimum additional computation cost.

Domain Adaptation object-detection +3

When to Prune? A Policy towards Early Structural Pruning

no code implementations CVPR 2022 Maying Shen, Pavlo Molchanov, Hongxu Yin, Jose M. Alvarez

Through extensive experiments on ImageNet, we show that EPI empowers a quick tracking of early training epochs suitable for pruning, offering the same efficacy as an otherwise "oracle" grid-search that scans through epochs and requires orders of magnitude more compute.

Network Pruning

HALP: Hardware-Aware Latency Pruning

1 code implementation 20 Oct 2021 Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez

We propose Hardware-Aware Latency Pruning (HALP) that formulates structural pruning as a global resource allocation optimization problem, aiming at maximizing the accuracy while constraining latency under a predefined budget.

Privacy Vulnerability of Split Computing to Data-Free Model Inversion Attacks

no code implementations 13 Jul 2021 Xin Dong, Hongxu Yin, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov, H. T. Kung

Prior works usually assume that SC offers privacy benefits as only intermediate features, instead of private data, are shared from devices to the cloud.

Optimal Quantization Using Scaled Codebook

no code implementations CVPR 2021 Yerlan Idelbayev, Pavlo Molchanov, Maying Shen, Hongxu Yin, Miguel A. Carreira-Perpinan, Jose M. Alvarez

We study the problem of quantizing N sorted, scalar datapoints with a fixed codebook containing K entries that are allowed to be rescaled.

Quantization

Distilling Image Classifiers in Object Detectors

1 code implementation NeurIPS 2021 Shuxuan Guo, Jose M. Alvarez, Mathieu Salzmann

Knowledge distillation constitutes a simple yet effective way to improve the performance of a compact student network by exploiting the knowledge of a more powerful teacher.

Knowledge Distillation Object +3

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

23 code implementations NeurIPS 2021 Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo

We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders.

C++ code Semantic Segmentation +1

See through Gradients: Image Batch Recovery via GradInversion

2 code implementations CVPR 2021 Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

In this work, we introduce GradInversion, using which input images from a larger batch (8 - 48 images) can also be recovered for large networks such as ResNets (50 layers), on complex datasets such as ImageNet (1000 classes, 224x224 px).

Federated Learning Inference Attack +1

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection

1 code implementation 12 Apr 2021 Nadine Chang, Zhiding Yu, Yu-Xiong Wang, Anima Anandkumar, Sanja Fidler, Jose M. Alvarez

As a result, image resampling alone is not enough to yield a sufficiently balanced distribution at the object level.

Object

Self-supervised Learning of Depth Inference for Multi-view Stereo

1 code implementation CVPR 2021 Jiayu Yang, Jose M. Alvarez, Miaomiao Liu

Here, we propose a self-supervised learning framework for multi-view stereo that exploits pseudo labels from the input data.

Depth Estimation Image Reconstruction +1

Active Learning for Deep Object Detection via Probabilistic Modeling

1 code implementation ICCV 2021 Jiwoong Choi, Ismail Elezi, Hyuk-Jae Lee, Clement Farabet, Jose M. Alvarez

Most of these methods are based on multiple models or are straightforward extensions of classification methods, hence estimate an image's informativeness using only the classification head.

Active Learning Classification +5

Deep Active Learning for Object Detection with Mixture Density Networks

no code implementations 1 Jan 2021 Jiwoong Choi, Ismail Elezi, Hyuk-Jae Lee, Clement Farabet, Jose M. Alvarez

For active learning, we propose a scoring function that aggregates uncertainties from both the classification and the localization outputs of the network.

Active Learning Informativeness +3

Personalized Federated Learning with First Order Model Optimization

3 code implementations ICLR 2021 Michael Zhang, Karan Sapra, Sanja Fidler, Serena Yeung, Jose M. Alvarez

While federated learning traditionally aims to train a single global model across decentralized local datasets, one model may not always be ideal for all participating clients.

Model Optimization Personalized Federated Learning

Context Based Emotion Recognition using EMOTIC Dataset

3 code implementations 30 Mar 2020 Ronak Kosti, Jose M. Alvarez, Adria Recasens, Agata Lapedriza

In this paper we present EMOTIC, a dataset of images of people in a diverse set of natural situations, annotated with their apparent emotion.

Ranked #4 on Emotion Recognition in Context on EMOTIC (using extra training data)

Emotion Recognition in Context

Training Data Distribution Search with Ensemble Active Learning

no code implementations 25 Sep 2019 Kashyap Chitta, Jose M. Alvarez, Elmar Haussmann, Clement Farabet

In this paper, we propose to scale up ensemble Active Learning methods to perform acquisition at a large scale (10k to 500k samples at a time).

Active Learning Image Classification

Quadtree Generating Networks: Efficient Hierarchical Scene Parsing with Sparse Convolutions

1 code implementation 27 Jul 2019 Kashyap Chitta, Jose M. Alvarez, Martial Hebert

Semantic segmentation with Convolutional Neural Networks is a memory-intensive task due to the high spatial resolution of feature maps and output predictions.

Scene Parsing Segmentation +1

Training Data Subset Search with Ensemble Active Learning

no code implementations 29 May 2019 Kashyap Chitta, Jose M. Alvarez, Elmar Haussmann, Clement Farabet

In this paper, we propose to scale up ensemble Active Learning (AL) methods to perform acquisition at a large scale (10k to 500k samples at a time).

Active Learning Autonomous Driving +3

ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks

no code implementations NeurIPS 2020 Shuxuan Guo, Jose M. Alvarez, Mathieu Salzmann

As evidenced by our experiments, our approach outperforms both training the compact network from scratch and performing knowledge distillation from a teacher.

General Classification Image Classification +5

Deep Probabilistic Ensembles: Approximate Variational Inference through KL Regularization

no code implementations 6 Nov 2018 Kashyap Chitta, Jose M. Alvarez, Adam Lesnikowski

In this paper, we introduce Deep Probabilistic Ensembles (DPEs), a scalable technique that uses a regularized ensemble to approximate a deep Bayesian Neural Network (BNN).

Active Learning General Classification +1

Effective Use of Synthetic Data for Urban Scene Semantic Segmentation

no code implementations ECCV 2018 Fatemeh Sadat Saleh, Mohammad Sadegh Aliakbarian, Mathieu Salzmann, Lars Petersson, Jose M. Alvarez

Our approach builds on the observation that foreground and background classes are not affected in the same manner by the domain shift, and thus should be treated differently.

Domain Adaptation Semantic Segmentation

Compression-aware Training of Deep Networks

no code implementations NeurIPS 2017 Jose M. Alvarez, Mathieu Salzmann

In recent years, great progress has been made in a variety of application domains thanks to the development of increasingly deeper neural networks.

Domain-adaptive deep network compression

2 code implementations ICCV 2017 Marc Masana, Joost Van de Weijer, Luis Herranz, Andrew D. Bagdanov, Jose M. Alvarez

We show that domain transfer leads to large shifts in network activations and that it is desirable to take this into account when compressing.

Low-rank compression

Bringing Background into the Foreground: Making All Classes Equal in Weakly-supervised Video Semantic Segmentation

no code implementations ICCV 2017 Fatemeh Sadat Saleh, Mohammad Sadegh Aliakbarian, Mathieu Salzmann, Lars Petersson, Jose M. Alvarez

Our experiments demonstrate the benefits of our classifier heatmaps and of our two-stream architecture on challenging urban scene datasets and on the YouTube-Objects benchmark, where we obtain state-of-the-art results.

Autonomous Navigation Segmentation +3

Class-Weighted Convolutional Features for Visual Instance Search

2 code implementations 9 Jul 2017 Albert Jimenez, Jose M. Alvarez, Xavier Giro-i-Nieto

In this paper, we go beyond this spatial information and propose a local-aware encoding of convolutional features based on semantic information predicted in the target image.

Image Retrieval Instance Search +2

Emotion Recognition in Context

no code implementations CVPR 2017 Ronak Kosti, Jose M. Alvarez, Adria Recasens, Agata Lapedriza

In this paper we present the Emotions in Context Database (EMCO), a dataset of images containing people in context in non-controlled environments.

Emotion Recognition in Context

Incorporating Network Built-in Priors in Weakly-supervised Semantic Segmentation

no code implementations 6 Jun 2017 Fatemeh Sadat Saleh, Mohammad Sadegh Aliakbarian, Mathieu Salzmann, Lars Petersson, Jose M. Alvarez, Stephen Gould

We then show how to obtain multi-class masks by the fusion of foreground/background ones with information extracted from a weakly-supervised localization network.

Object Recognition Segmentation +3

Learning the Number of Neurons in Deep Networks

no code implementations NeurIPS 2016 Jose M. Alvarez, Mathieu Salzmann

In this paper, we introduce an approach to automatically determining the number of neurons in each layer of a deep network during learning.

Learning Image Matching by Simply Watching Video

no code implementations 19 Mar 2016 Gucan Long, Laurent Kneip, Jose M. Alvarez, Hongdong Li

This work presents an unsupervised learning based approach to the ubiquitous computer vision problem of image matching.

Road Detection by One-Class Color Classification: Dataset and Experiments

no code implementations 11 Dec 2014 Jose M. Alvarez, Theo Gevers, Antonio M. Lopez

These algorithms reduce the effect of lighting variations and weather conditions by exploiting the discriminant/invariant properties of different color representations.

Autonomous Driving Classification +1
