Search Results for author: Stefano Soatto

Found 171 papers, 46 papers with code

Incremental Few-Shot Meta-Learning via Indirect Discriminant Alignment

no code implementations ECCV 2020 Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

This process enables incrementally improving the model by processing multiple learning episodes, each representing a different learning task, even with few training examples.

Few-Shot Learning Incremental Learning

Multi-Modal Hallucination Control by Visual Information Grounding

no code implementations 20 Mar 2024 Alessandro Favero, Luca Zancato, Matthew Trager, Siddharth Choudhary, Pramuditha Perera, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto

In particular, we show that as more tokens are generated, the reliance on the visual prompt decreases, and this behavior strongly correlates with the emergence of hallucinations.

Hallucination Visual Question Answering (VQA)

Fast Sparse View Guided NeRF Update for Object Reconfigurations

no code implementations 16 Mar 2024 Ziqi Lu, Jianbo Ye, Xiaohan Fei, Xiaolong Li, Jiawei Mo, Ashwin Swaminathan, Stefano Soatto

Neural Radiance Field (NeRF), as an implicit 3D scene representation, lacks inherent ability to accommodate changes made to the initial static scene.

Enhancing Vision-Language Pre-training with Rich Supervisions

no code implementations 5 Mar 2024 Yuan Gao, Kunyu Shi, Pengkai Zhu, Edouard Belval, Oren Nuriel, Srikar Appalaraju, Shabnam Ghadar, Vijay Mahadevan, Zhuowen Tu, Stefano Soatto

We propose Strongly Supervised pre-training with ScreenShots (S4) - a novel pre-training paradigm for Vision-Language Models using data from large-scale web screenshot rendering.

Table Detection

Non-autoregressive Sequence-to-Sequence Vision-Language Models

no code implementations 4 Mar 2024 Kunyu Shi, Qi Dong, Luis Goncalves, Zhuowen Tu, Stefano Soatto

Sequence-to-sequence vision-language models are showing promise, but their applicability is limited by their inference latency due to their autoregressive way of generating predictions.

Language Modelling

A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D

no code implementations 29 Feb 2024 Xiaohan Fei, Chethan Parameshwara, Jiawei Mo, Xiaolong Li, Ashwin Swaminathan, Cj Taylor, Paolo Favaro, Stefano Soatto

However, the SDS method is also the source of several artifacts, such as the Janus problem, the misalignment between the text prompt and the generated 3D model, and 3D model inaccuracies.

Image Generation Text to 3D

Meaning Representations from Trajectories in Autoregressive Models

1 code implementation 23 Oct 2023 Tian Yu Liu, Matthew Trager, Alessandro Achille, Pramuditha Perera, Luca Zancato, Stefano Soatto

We propose to extract meaning representations from autoregressive language models by considering the distribution of all possible trajectories extending an input text.

Semantic Similarity Semantic Textual Similarity

AugUndo: Scaling Up Augmentations for Unsupervised Depth Completion

no code implementations 15 Oct 2023 Yangchao Wu, Tian Yu Liu, Hyoungseob Park, Stefano Soatto, Dong Lao, Alex Wong

The sparse depth modality has seen even less, as intensity transformations alter the scale of the 3D scene, and geometric transformations may decimate the sparse points during resampling.

Data Augmentation Depth Completion +1

Sub-token ViT Embedding via Stochastic Resonance Transformers

no code implementations 6 Oct 2023 Dong Lao, Yangchao Wu, Tian Yu Liu, Alex Wong, Stefano Soatto

We term our method "Stochastic Resonance Transformer" (SRT), which we show can effectively super-resolve features of pre-trained ViTs, capturing more of the local fine-grained structures that might otherwise be neglected as a result of tokenization.

Depth Estimation Depth Prediction +6

Critical Learning Periods Emerge Even in Deep Linear Networks

no code implementations 23 Aug 2023 Michael Kleinman, Alessandro Achille, Stefano Soatto

Critical learning periods are periods early in development where temporary sensory deficits can have a permanent effect on behavior and learned representations.

Multi-Task Learning

Training Data Protection with Compositional Diffusion Models

no code implementations 2 Aug 2023 Aditya Golatkar, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto

We introduce Compartmentalized Diffusion Models (CDM), a method to train different diffusion models (or prompts) on distinct data sources and arbitrarily compose them at inference time.

Continual Learning Memorization +1

Tangent Model Composition for Ensembling and Continual Fine-tuning

no code implementations ICCV 2023 Tian Yu Liu, Stefano Soatto

Component models are composed at inference time via scalar combination, reducing the cost of ensembling to that of a single model.

Incremental Learning

Tangent Transformers for Composition, Privacy and Removal

no code implementations 16 Jul 2023 Tian Yu Liu, Aditya Golatkar, Stefano Soatto

We introduce Tangent Attention Fine-Tuning (TAFT), a method for fine-tuning linearized transformers obtained by computing a First-order Taylor Expansion around a pre-trained initialization.

Machine Unlearning
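
As a reading aid for the Tangent Transformers entry above, the first-order Taylor expansion it mentions can be written in generic form as follows. The notation is an assumption for illustration and is not taken from the paper: $f(x; w)$ is the network output for input $x$ and weights $w$, $w_0$ is the pre-trained initialization, and $\nabla_w f$ is the Jacobian of the output with respect to the weights.

```latex
f_{\mathrm{lin}}(x; w) \;=\; f(x; w_0) \;+\; \nabla_w f(x; w_0)\,(w - w_0)
```

The linearized model $f_{\mathrm{lin}}$ is affine in $w$, so fine-tuning it with a convex loss is a convex problem in the weights.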

Towards Visual Foundational Models of Physical Scenes

no code implementations 6 Jun 2023 Chethan Parameshwara, Alessandro Achille, Xiaolong Li, Jiawei Mo, Matthew Trager, Ashwin Swaminathan, Cj Taylor, Dheera Venkatraman, Xiaohan Fei, Stefano Soatto

We describe a first step towards learning general-purpose visual representations of physical scenes using only image prediction as a training criterion.

Prompt Algebra for Task Composition

no code implementations 1 Jun 2023 Pramuditha Perera, Matthew Trager, Luca Zancato, Alessandro Achille, Stefano Soatto

We investigate whether prompts learned independently for different tasks can be later combined through prompt algebra to obtain a model that supports composition of tasks.

Attribute Classification

Taming AI Bots: Controllability of Neural States in Large Language Models

no code implementations 29 May 2023 Stefano Soatto, Paulo Tabuada, Pratik Chaudhari, Tian Yu Liu

We then characterize the subset of meanings that can be reached by the state of the LLMs for some input prompt, and show that a well-trained bot can reach any meaning albeit with small probability.

Learning for Transductive Threshold Calibration in Open-World Recognition

no code implementations 19 May 2023 Qin Zhang, Dongsheng An, Tianjun Xiao, Tong He, Qingming Tang, Ying Nian Wu, Joseph Tighe, Yifan Xing, Stefano Soatto

In deep metric learning for visual recognition, the calibration of distance thresholds is crucial for achieving desired model performance in the true positive rates (TPR) or true negative rates (TNR).

Metric Learning Open Set Learning

Musketeer: Joint Training for Multi-task Vision Language Model with Task Explanation Prompts

1 code implementation 11 May 2023 Zhaoyang Zhang, Yantao Shen, Kunyu Shi, Zhaowei Cai, Jun Fang, Siqi Deng, Hao Yang, Davide Modolo, Zhuowen Tu, Stefano Soatto

We present a vision-language model whose parameters are jointly trained on all tasks and fully shared among multiple heterogeneous tasks which may interfere with each other, resulting in a single model which we named Musketeer.

Language Modelling

SAFE: Machine Unlearning With Shard Graphs

no code implementations ICCV 2023 Yonatan Dukler, Benjamin Bowman, Alessandro Achille, Aditya Golatkar, Ashwin Swaminathan, Stefano Soatto

We present Synergy Aware Forgetting Ensemble (SAFE), a method to adapt large models on a diverse collection of data while minimizing the expected cost to remove the influence of training samples from the trained model.

Machine Unlearning

AI Model Disgorgement: Methods and Choices

no code implementations 7 Apr 2023 Alessandro Achille, Michael Kearns, Carson Klingenberg, Stefano Soatto

One potential fix for training corpus data defects is model disgorgement -- the elimination of not just the improperly used data, but also the effects of improperly used data on any component of an ML model.

Train/Test-Time Adaptation with Retrieval

no code implementations CVPR 2023 Luca Zancato, Alessandro Achille, Tian Yu Liu, Matthew Trager, Pramuditha Perera, Stefano Soatto

Second, we apply ${\rm T^3AR}$ for test-time adaptation and show that exploiting a pool of external images at test-time leads to more robust representations over existing methods on DomainNet-126 and VISDA-C, especially when few adaptation data are available (up to 8%).

Retrieval Test-time Adaptation

Feature Tracks are not Zero-Mean Gaussian

no code implementations 25 Mar 2023 Stephanie Tsuei, Wenjie Mo, Stefano Soatto

In state estimation algorithms that use feature tracks as input, it is customary to assume that the errors in feature track positions are zero-mean Gaussian.

Depth Estimation From Camera Image and mmWave Radar Point Cloud

no code implementations CVPR 2023 Akash Deep Singh, Yunhao Ba, Ankur Sarker, Howard Zhang, Achuta Kadambi, Stefano Soatto, Mani Srivastava, Alex Wong

To fuse radar depth with an image, we propose a gated fusion scheme that accounts for the confidence scores of the correspondence so that we selectively combine radar and camera embeddings to yield a dense depth map.

Depth Estimation

Guided Recommendation for Model Fine-Tuning

no code implementations CVPR 2023 Hao Li, Charless Fowlkes, Hao Yang, Onkar Dabeer, Zhuowen Tu, Stefano Soatto

With thousands of historical training jobs, a recommendation system can be learned to predict the model selection score given the features of the dataset and the model as input.

Model Selection Transfer Learning

Integral Continual Learning Along the Tangent Vector Field of Tasks

no code implementations 23 Nov 2022 Tian Yu Liu, Aditya Golatkar, Stefano Soatto, Alessandro Achille

We propose a lightweight continual learning method which incorporates information from specialized datasets incrementally, by integrating it along the vector field of "generalist" models.

Continual Learning

Stain-invariant self supervised learning for histopathology image analysis

1 code implementation 14 Nov 2022 Alexandre Tiard, Alex Wong, David Joon Ho, Yangchao Wu, Eliram Nof, Alvin C. Goh, Stefano Soatto, Saad Nadeem

Our method achieves the state-of-the-art performance on several publicly available breast cancer datasets ranging from tumor classification (CAMELYON17) and subtyping (BRACS) to HER2 status classification and treatment response prediction.

Classification Self-Supervised Learning

Critical Learning Periods for Multisensory Integration in Deep Networks

1 code implementation CVPR 2023 Michael Kleinman, Alessandro Achille, Stefano Soatto

We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.

Semi-supervised Vision Transformers at Scale

1 code implementation 11 Aug 2022 Zhaowei Cai, Avinash Ravichandran, Paolo Favaro, Manchen Wang, Davide Modolo, Rahul Bhotika, Zhuowen Tu, Stefano Soatto

We study semi-supervised learning (SSL) for vision transformers (ViT), an under-explored topic despite the wide adoption of the ViT architectures to different tasks.

Inductive Bias Semi-Supervised Image Classification

Masked Vision and Language Modeling for Multi-modal Representation Learning

no code implementations 3 Aug 2022 Gukyeong Kwon, Zhaowei Cai, Avinash Ravichandran, Erhan Bas, Rahul Bhotika, Stefano Soatto

Instead of developing masked language modeling (MLM) and masked image modeling (MIM) independently, we propose to build joint masked vision and language modeling, where the masked signal of one modality is reconstructed with the help from another modality.

Language Modelling Masked Language Modeling +1

On the Learnability of Physical Concepts: Can a Neural Network Understand What's Real?

no code implementations 25 Jul 2022 Alessandro Achille, Stefano Soatto

We revisit the classic signal-to-symbol barrier in light of the remarkable ability of deep neural networks to generate realistic synthetic data.

On Leave-One-Out Conditional Mutual Information For Generalization

no code implementations 1 Jul 2022 Mohamad Rida Rammal, Alessandro Achille, Aditya Golatkar, Suhas Diggavi, Stefano Soatto

We derive information theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI).

Generalization Bounds Image Classification

Not Just Streaks: Towards Ground Truth for Single Image Deraining

1 code implementation 22 Jun 2022 Yunhao Ba, Howard Zhang, Ethan Yang, Akira Suzuki, Arnold Pfahnl, Chethan Chinder Chandrappa, Celso de Melo, Suya You, Stefano Soatto, Alex Wong, Achuta Kadambi

We propose a large-scale dataset of real-world rainy and clean image pairs and a method to remove degradations, induced by rain streaks and rain accumulation, from the image.

Single Image Deraining

Gacs-Korner Common Information Variational Autoencoder

1 code implementation NeurIPS 2023 Michael Kleinman, Alessandro Achille, Stefano Soatto, Jonathan Kao

We propose a notion of common information that allows one to quantify and separate the information that is shared between two random variables from the information that is unique to each.

ELODI: Ensemble Logit Difference Inhibition for Positive-Congruent Training

no code implementations 12 May 2022 Yue Zhao, Yantao Shen, Yuanjun Xiong, Shuo Yang, Wei Xia, Zhuowen Tu, Bernt Schiele, Stefano Soatto

We present a method to train a classification system that achieves paragon performance in both error rate and NFR, at the inference cost of a single model.

Graph Spectral Embedding using the Geodesic Betweeness Centrality

no code implementations 7 May 2022 Shay Deutsch, Stefano Soatto

We introduce the Graph Sylvester Embedding (GSE), an unsupervised graph representation of local similarity, connectivity, and global structure.

X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks

no code implementations 12 Apr 2022 Zhaowei Cai, Gukyeong Kwon, Avinash Ravichandran, Erhan Bas, Zhuowen Tu, Rahul Bhotika, Stefano Soatto

In this paper, we study the challenging instance-wise vision-language tasks, where the free-form language is required to align with the objects instead of the whole image.

Class-Incremental Learning with Strong Pre-trained Models

1 code implementation CVPR 2022 Tz-Ying Wu, Gurumurthy Swaminathan, Zhizhong Li, Avinash Ravichandran, Nuno Vasconcelos, Rahul Bhotika, Stefano Soatto

We hypothesize that a strong base model can provide a good representation for novel classes and incremental learning can be done with small adaptations.

Class Incremental Learning Incremental Learning

MeMOT: Multi-Object Tracking with Memory

no code implementations CVPR 2022 Jiarui Cai, Mingze Xu, Wei Li, Yuanjun Xiong, Wei Xia, Zhuowen Tu, Stefano Soatto

We propose an online tracking algorithm that performs the object detection and data association under a common framework, capable of linking objects after a long time span.

Multi-Object Tracking Object +2

Towards Differential Relational Privacy and its use in Question Answering

no code implementations 30 Mar 2022 Simone Bombari, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, Stefano Soatto

While bounding general memorization can have detrimental effects on the performance of a trained model, bounding RM does not prevent effective learning.

Memorization Question Answering

Task Adaptive Parameter Sharing for Multi-Task Learning

1 code implementation CVPR 2022 Matthew Wallingford, Hao Li, Alessandro Achille, Avinash Ravichandran, Charless Fowlkes, Rahul Bhotika, Stefano Soatto

TAPS solves a joint optimization problem which determines which layers to share with the base model and the value of the task-specific weights.

Multi-Task Learning

Omni-DETR: Omni-Supervised Object Detection with Transformers

1 code implementation CVPR 2022 Pei Wang, Zhaowei Cai, Hao Yang, Gurumurthy Swaminathan, Nuno Vasconcelos, Bernt Schiele, Stefano Soatto

This is enabled by a unified architecture, Omni-DETR, based on the recent progress on student-teacher framework and end-to-end transformer based object detection.

Object object-detection +2

On the Viability of Monocular Depth Pre-training for Semantic Segmentation

no code implementations 26 Mar 2022 Dong Lao, Alex Wong, Samuel Lu, Stefano Soatto

We explore how pre-training a model to infer depth from a single image compares to pre-training the model for a semantic task, e.g., ImageNet classification, for the purpose of downstream transfer to semantic segmentation.

Image Classification Monocular Depth Estimation +2

Mixed Differential Privacy in Computer Vision

no code implementations CVPR 2022 Aditya Golatkar, Alessandro Achille, Yu-Xiang Wang, Aaron Roth, Michael Kearns, Stefano Soatto

AdaMix incorporates few-shot training, or cross-modal zero-shot learning, on public data prior to private fine-tuning, to improve the trade-off.

Zero-Shot Learning

Contrastive Neighborhood Alignment

no code implementations 6 Jan 2022 Pengkai Zhu, Zhaowei Cai, Yuanjun Xiong, Zhuowen Tu, Luis Goncalves, Vijay Mahadevan, Stefano Soatto

We present Contrastive Neighborhood Alignment (CNA), a manifold learning approach to maintain the topology of learned features whereby data points that are mapped to nearby representations by the source (teacher) model are also mapped to neighbors by the target (student) model.

DIVA: Dataset Derivative of a Learning Task

no code implementations ICLR 2022 Yonatan Dukler, Alessandro Achille, Giovanni Paolini, Avinash Ravichandran, Marzia Polito, Stefano Soatto

A learning task is a function from a training set to the validation error, which can be represented by a trained deep neural network (DNN).

AutoML

Representation Consolidation from Multiple Expert Teachers

no code implementations 29 Sep 2021 Zhizhong Li, Avinash Ravichandran, Charless Fowlkes, Marzia Polito, Rahul Bhotika, Stefano Soatto

Indeed, we observe experimentally that standard distillation of task-specific teachers, or using these teacher representations directly, **reduces** downstream transferability compared to a task-agnostic generalist model.

Knowledge Distillation

STRIC: Stacked Residuals of Interpretable Components for Time Series Anomaly Detection

no code implementations 29 Sep 2021 Luca Zancato, Alessandro Achille, Giovanni Paolini, Alessandro Chiuso, Stefano Soatto

After modeling the signals, we use an anomaly detection system based on the classic CUSUM algorithm and a variational approximation of the $f$-divergence to detect both isolated point anomalies and change-points in statistics of the signals.

Anomaly Detection Time Series +1
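
For readers unfamiliar with the cumulative-sum test that the STRIC entry above refers to, the following is a minimal sketch of the textbook two-sided CUSUM detector. It is not the paper's method, which pairs CUSUM with a variational approximation of the $f$-divergence. The function name `cusum_alarms` and the parameters `target_mean`, `slack`, and `threshold` are hypothetical names chosen for illustration.

```python
def cusum_alarms(samples, target_mean, slack, threshold):
    """Textbook two-sided CUSUM (illustrative; all names are hypothetical).

    Accumulates deviations from target_mean beyond a slack allowance and
    records the sample index whenever either running sum exceeds threshold.
    Both sums are reset after each alarm.
    """
    upper = 0.0  # running evidence for an upward shift of the mean
    lower = 0.0  # running evidence for a downward shift of the mean
    alarms = []
    for index, value in enumerate(samples):
        upper = max(0.0, upper + (value - target_mean) - slack)
        lower = max(0.0, lower + (target_mean - value) - slack)
        if upper > threshold or lower > threshold:
            alarms.append(index)
            upper = 0.0
            lower = 0.0
    return alarms


if __name__ == "__main__":
    # A signal whose mean jumps from 0 to 5 halfway through.
    signal = [0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 5.0]
    print(cusum_alarms(signal, target_mean=0.0, slack=1.0, threshold=6.0))
```

The slack term keeps small fluctuations from accumulating, while the threshold trades detection delay against false alarms.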

Small Lesion Segmentation in Brain MRIs with Subpixel Embedding

1 code implementation 18 Sep 2021 Alex Wong, Allison Chen, Yangchao Wu, Safa Cicek, Alexandre Tiard, Byung-Woo Hong, Stefano Soatto

We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.

Lesion Segmentation

Unsupervised Depth Completion with Calibrated Backprojection Layers

1 code implementation ICCV 2021 Alex Wong, Stefano Soatto

At inference time, the calibration of the camera, which can be different than the one used for training, is fed as an input to the network along with the sparse point cloud and a single image.

Depth Completion

Uniform Sampling over Episode Difficulty

2 code implementations NeurIPS 2021 Sébastien M. R. Arnold, Guneet S. Dhillon, Avinash Ravichandran, Stefano Soatto

Episodic training is a core ingredient of few-shot learning to train models on tasks with limited labelled data.

Few-Shot Learning

SABER: Data-Driven Motion Planner for Autonomously Navigating Heterogeneous Robots

1 code implementation 3 Aug 2021 Alexander Schperberg, Stephanie Tsuei, Stefano Soatto, Dennis Hong

We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal while avoiding obstacles in uncertain environments.

Model Predictive Control Motion Planning +3

Representation Consolidation for Training Expert Students

no code implementations 16 Jul 2021 Zhizhong Li, Avinash Ravichandran, Charless Fowlkes, Marzia Polito, Rahul Bhotika, Stefano Soatto

Traditionally, distillation has been used to train a student model to emulate the input/output functionality of a teacher.

Long Short-Term Transformer for Online Action Detection

2 code implementations NeurIPS 2021 Mingze Xu, Yuanjun Xiong, Hao Chen, Xinyu Li, Wei Xia, Zhuowen Tu, Stefano Soatto

We present Long Short-term TRansformer (LSTR), a temporal modeling algorithm for online action detection, which employs a long- and short-term memory mechanism to model prolonged sequence data.

Online Action Detection Playing the Game of 2048

Learning Hierarchical Graph Neural Networks for Image Clustering

2 code implementations ICCV 2021 Yifan Xing, Tong He, Tianjun Xiao, Yongxin Wang, Yuanjun Xiong, Wei Xia, David Wipf, Zheng Zhang, Stefano Soatto

Our hierarchical GNN uses a novel approach to merge connected components predicted at each level of the hierarchy to form a new graph at the next level.

Clustering Face Clustering

Dynamically Grown Generative Adversarial Networks

no code implementations 16 Jun 2021 Lanlan Liu, Yuting Zhang, Jia Deng, Stefano Soatto

Recent work introduced progressive network growing as a promising way to ease the training of large GANs, but the model design and architecture-growing strategy remain under-explored and need manual design for different image data.

Image Generation

Harnessing Unrecognizable Faces for Improving Face Recognition

no code implementations 8 Jun 2021 Siqi Deng, Yuanjun Xiong, Meng Wang, Wei Xia, Stefano Soatto

The common implementation of face recognition systems as a cascade of a detection stage and a recognition or verification stage can cause problems beyond failures of the detector.

Face Recognition Quantization

An Adaptive Framework for Learning Unsupervised Depth Completion

1 code implementation 6 Jun 2021 Alex Wong, Xiaohan Fei, Byung-Woo Hong, Stefano Soatto

We present a method to infer a dense depth map from a color image and associated sparse depth measurements.

Depth Completion

Learning Topology from Synthetic Data for Unsupervised Depth Completion

1 code implementation 6 Jun 2021 Alex Wong, Safa Cicek, Stefano Soatto

We present a method for inferring dense depth maps from images and sparse depth measurements by leveraging synthetic data to learn the association of sparse point clouds with dense natural shapes, and using the image as evidence to validate the predicted depth map.

Depth Completion

Compatibility-aware Heterogeneous Visual Search

no code implementations CVPR 2021 Rahul Duggal, Hao Zhou, Shuo Yang, Yuanjun Xiong, Wei Xia, Zhuowen Tu, Stefano Soatto

Existing systems use the same embedding model to compute representations (embeddings) for the query and gallery images.

Neural Architecture Search Retrieval

Visual Relationship Detection Using Part-and-Sum Transformers with Composite Queries

no code implementations ICCV 2021 Qi Dong, Zhuowen Tu, Haofu Liao, Yuting Zhang, Vijay Mahadevan, Stefano Soatto

Computer vision applications such as visual relationship detection and human object interaction can be formulated as a composite (structured) set detection problem in which both the parts (subject, object, and predicate) and the sum (triplet as a whole) are to be detected in a hierarchical fashion.

Human-Object Interaction Detection Object +2

Learning Semantic-Aware Dynamics for Video Prediction

no code implementations CVPR 2021 Xinzhu Bei, Yanchao Yang, Stefano Soatto

The appearance of the scene is warped from past frames using the predicted motion in co-visible regions; dis-occluded regions are synthesized with content-aware inpainting utilizing the predicted scene layout.

Optical Flow Estimation Video Prediction

Redundant Information Neural Estimation

no code implementations ICLR Workshop Neural_Compression 2021 Michael Kleinman, Alessandro Achille, Stefano Soatto, Jonathan Kao

We introduce the Redundant Information Neural Estimator (RINE), a method that allows efficient estimation for the component of information about a target variable that is common to a set of sources, previously referred to as the “redundant information.” We show that existing definitions of the redundant information can be recast in terms of an optimization over a family of deterministic or stochastic functions.

Image Classification

A linearized framework and a new benchmark for model selection for fine-tuning

no code implementations 29 Jan 2021 Aditya Deshpande, Alessandro Achille, Avinash Ravichandran, Hao Li, Luca Zancato, Charless Fowlkes, Rahul Bhotika, Stefano Soatto, Pietro Perona

Since all model selection algorithms in the literature have been tested on different use-cases and never compared directly, we introduce a new comprehensive benchmark for model selection comprising: i) a model zoo of single and multi-domain models, and ii) many target tasks.

Feature Correlation Model Selection

Supervised Momentum Contrastive Learning for Few-Shot Classification

no code implementations 26 Jan 2021 Orchid Majumder, Avinash Ravichandran, Subhransu Maji, Alessandro Achille, Marzia Polito, Stefano Soatto

In this work we investigate the complementary roles of these two sources of information by combining instance-discriminative contrastive learning and supervised learning in a single framework called Supervised Momentum Contrastive learning (SUPMOCO).

Classification Contrastive Learning +4

Exponential Moving Average Normalization for Self-supervised and Semi-supervised Learning

1 code implementation CVPR 2021 Zhaowei Cai, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Zhuowen Tu, Stefano Soatto

We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques.

Self-Supervised Learning Semi-Supervised Image Classification

Estimating informativeness of samples with Smooth Unique Information

1 code implementation ICLR 2021 Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

We define a notion of information that an individual sample provides to the training of a neural network, and we specialize it to measure both how much a sample informs the final weights and how much it informs the function computed by the weights.

Informativeness

Structured Prediction as Translation between Augmented Natural Languages

1 code implementation ICLR 2021 Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking.

coreference-resolution Dialogue State Tracking +11

Mixed-Privacy Forgetting in Deep Networks

no code implementations CVPR 2021 Aditya Golatkar, Alessandro Achille, Avinash Ravichandran, Marzia Polito, Stefano Soatto

We show that the influence of a subset of the training samples can be removed -- or "forgotten" -- from the weights of a network trained on large-scale image classification tasks, and we provide strong computable bounds on the amount of remaining information after forgetting.

Image Classification

LQF: Linear Quadratic Fine-Tuning

no code implementations CVPR 2021 Alessandro Achille, Aditya Golatkar, Avinash Ravichandran, Marzia Polito, Stefano Soatto

Classifiers that are linear in their parameters, and trained by optimizing a convex loss function, have predictable behavior with respect to changes in the training data, initial conditions, and optimization.

Image Classification

Positive-Congruent Training: Towards Regression-Free Model Updates

no code implementations CVPR 2021 Sijie Yan, Yuanjun Xiong, Kaustav Kundu, Shuo Yang, Siqi Deng, Meng Wang, Wei Xia, Stefano Soatto

Reducing inconsistencies in the behavior of different versions of an AI system can be as important in practice as reducing its overall error.

Image Classification regression

Spectral Embedding of Graph Networks

no code implementations 30 Sep 2020 Shay Deutsch, Stefano Soatto

We introduce an unsupervised graph embedding that trades off local node similarity and connectivity, and global structure.

Graph Embedding Node Classification

Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations

1 code implementation 21 Sep 2020 Alex Wong, Mukund Mundhra, Stefano Soatto

We study the effect of adversarial perturbations of images on the estimates of disparity by deep learning models trained for stereo.

Adversarial Attack Adversarial Defense +3

Predicting Training Time Without Training

no code implementations NeurIPS 2020 Luca Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function.

Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance

no code implementations 28 Jul 2020 Alexander Schperberg, Kenny Chen, Stephanie Tsuei, Michael Jewett, Joshua Hooks, Stefano Soatto, Ankur Mehta, Dennis Hong

In this paper, we propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties for safer navigation through cluttered environments.

Collision Avoidance Model Predictive Control +2

Stochastic batch size for adaptive regularization in deep network optimization

no code implementations 14 Apr 2020 Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong

We propose a first-order stochastic optimization algorithm incorporating adaptive regularization, applicable to machine learning problems in the deep learning framework.

Image Classification Stochastic Optimization

FDA: Fourier Domain Adaptation for Semantic Segmentation

3 code implementations CVPR 2020 Yanchao Yang, Stefano Soatto

We describe a simple method for unsupervised domain adaptation, whereby the discrepancy between the source and target distributions is reduced by swapping the low-frequency spectrum of one with the other.

Segmentation Semantic Segmentation +1
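
To make the spectrum swap in the FDA entry above concrete, here is a minimal sketch on 1-D signals using only the Python standard library. It is a simplification for illustration, not the authors' implementation: the paper works with images, whereas this example uses a naive 1-D DFT. It assumes the swap is applied to the amplitude spectrum while the source phase is kept. The function names and the `band` parameter are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def swap_low_frequency_amplitude(source, target, band):
    """Replace the low-frequency amplitudes of source with those of target.

    Frequencies whose distance from DC is <= band take the target amplitude;
    all others keep the source amplitude. The source phase is kept everywhere.
    Both sequences must have the same length.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    n = len(source)
    source_spectrum = dft(source)
    target_spectrum = dft(target)
    mixed = []
    for k in range(n):
        distance_from_dc = min(k, n - k)  # symmetric index for real signals
        if distance_from_dc <= band:
            amplitude = abs(target_spectrum[k])
        else:
            amplitude = abs(source_spectrum[k])
        phase = cmath.phase(source_spectrum[k])
        mixed.append(cmath.rect(amplitude, phase))
    return idft(mixed)


if __name__ == "__main__":
    source = [1.0, 2.0, 3.0, 4.0]
    target = [10.0, 10.0, 10.0, 10.0]
    # With band=0 only the DC component (the mean level) is swapped.
    print(swap_low_frequency_amplitude(source, target, band=0))
```

With `band=0` the adapted signal keeps the shape of the source but takes on the mean level of the target. Larger values of `band` transfer progressively more of the target's coarse structure.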

Learning to Manipulate Individual Objects in an Image

1 code implementation CVPR 2020 Yanchao Yang, Yutong Chen, Stefano Soatto

We describe a method to train a generative model with latent factors that are (approximately) independent and localized.

Disentanglement

Phase Consistent Ecological Domain Adaptation

1 code implementation CVPR 2020 Yanchao Yang, Dong Lao, Ganesh Sundaramoorthi, Stefano Soatto

We introduce two criteria to regularize the optimization involved in learning a classifier in a domain where no annotated data are available, leveraging annotated data in a different domain, a problem known as unsupervised domain adaptation.

Segmentation Semantic Segmentation +1

Towards Backward-Compatible Representation Learning

3 code implementations CVPR 2020 Yantao Shen, Yuanjun Xiong, Wei Xia, Stefano Soatto

Backward compatibility is critical to quickly deploy new embedding models that leverage ever-growing large-scale training datasets and improvements in deep learning architectures and training methods.

Face Recognition Representation Learning

Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations

1 code implementation ECCV 2020 Aditya Golatkar, Alessandro Achille, Stefano Soatto

We describe a procedure for removing dependency on a cohort of training data from a trained deep network that improves upon and generalizes previous methods to different readout functions and can be extended to ensure forgetting in the activations of the network.

Rethinking the Hyperparameters for Fine-tuning

1 code implementation ICLR 2020 Hao Li, Pratik Chaudhari, Hao Yang, Michael Lam, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

Our findings challenge common practices of fine-tuning and encourage deep learning practitioners to rethink the hyperparameters for fine-tuning.

Transfer Learning

Multi-Task Incremental Learning for Object Detection

no code implementations 13 Feb 2020 Xialei Liu, Hao Yang, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

For the difficult cases, where the domain gaps and especially the category differences are large, we explore three different exemplar sampling methods and show that the proposed adaptive sampling method effectively selects diverse and informative samples from entire datasets, further preventing forgetting.

Incremental Learning Object +2

Incremental Meta-Learning via Indirect Discriminant Alignment

no code implementations 11 Feb 2020 Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

The majority of modern meta-learning methods for few-shot classification tasks operate in two phases: a meta-training phase, where the meta-learner learns a generic representation by solving multiple few-shot tasks sampled from a large dataset, and a testing phase, where the meta-learner leverages its learnt internal representation for a specific few-shot task involving classes which were not seen during the meta-training phase.

Incremental Learning Meta-Learning

SAM: Squeeze-and-Mimic Networks for Conditional Visual Driving Policy Learning

1 code implementation 6 Dec 2019 Albert Zhao, Tong He, Yitao Liang, Haibin Huang, Guy Van Den Broeck, Stefano Soatto

To learn this representation, we train a squeeze network to drive using annotations for the side task as input.

Semantic Segmentation

Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks

2 code implementations CVPR 2020 Aditya Golatkar, Alessandro Achille, Stefano Soatto

We explore the problem of selectively forgetting a particular subset of the data used for training a deep neural network.

Meta-Q-Learning

2 code implementations ICLR 2020 Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

This paper introduces Meta-Q-Learning (MQL), a new off-policy algorithm for meta-Reinforcement Learning (meta-RL).

Continuous Control Meta Reinforcement Learning +1

Where is the Information in a Deep Network?

no code implementations 25 Sep 2019 Alessandro Achille, Stefano Soatto

We relate this to the Information in the Weights, and use this result to show that models of low (information) complexity not only generalize better, but are bound to learn invariant representations of future inputs.

Open-Ended Question Answering

A Baseline for Few-Shot Image Classification

3 code implementations ICLR 2020 Guneet S. Dhillon, Pratik Chaudhari, Avinash Ravichandran, Stefano Soatto

When fine-tuned transductively, this outperforms the current state-of-the-art on standard datasets such as Mini-ImageNet, Tiered-ImageNet, CIFAR-FS and FC-100 with the same hyper-parameters.

Classification Few-Shot Image Classification +2

Toward Understanding Catastrophic Forgetting in Continual Learning

no code implementations 2 Aug 2019 Cuong V. Nguyen, Alessandro Achille, Michael Lam, Tal Hassner, Vijay Mahadevan, Stefano Soatto

As an application, we apply our procedure to study two properties of a task sequence: (1) total complexity and (2) sequential heterogeneity.

Continual Learning

Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence

no code implementations NeurIPS 2019 Aditya Golatkar, Alessandro Achille, Stefano Soatto

Deep neural networks (DNNs), however, challenge this view: We show that removing regularization after an initial transient period has little effect on generalization, even if the final loss landscape is the same as if there had been no regularization.

Data Augmentation

Where is the Information in a Deep Neural Network?

no code implementations 29 May 2019 Alessandro Achille, Giovanni Paolini, Stefano Soatto

We establish a novel relation between the information in the weights and the effective information in the activations, and use this result to show that models with low (information) complexity not only generalize better, but are bound to learn invariant representations of future inputs.

Inductive Bias Open-Ended Question Answering

Unsupervised Domain Adaptation via Regularized Conditional Alignment

no code implementations ICCV 2019 Safa Cicek, Stefano Soatto

We propose a method for unsupervised domain adaptation that trains a shared embedding to align the joint distributions of inputs (domain) and outputs (classes), making any classifier agnostic to the domain.

Unsupervised Domain Adaptation

Unsupervised Depth Completion from Visual Inertial Odometry

2 code implementations 15 May 2019 Alex Wong, Xiaohan Fei, Stephanie Tsuei, Stefano Soatto

Our method first constructs a piecewise planar scaffolding of the scene, and then uses it to infer dense depth using the image along with the sparse points.

Depth Completion

Few-Shot Learning with Embedded Class Models and Shot-Free Meta Training

no code implementations ICCV 2019 Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

We propose a method for learning embeddings for few-shot learning that is suitable for use with any number of ways and any number of shots (shot-free).

Few-Shot Learning Metric Learning

Critical Learning Periods in Deep Networks

no code implementations ICLR 2019 Alessandro Achille, Matteo Rovere, Stefano Soatto

Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training.

Disentanglement

Meta-Learning with Differentiable Convex Optimization

7 code implementations CVPR 2019 Kwonjoon Lee, Subhransu Maji, Avinash Ravichandran, Stefano Soatto

We propose to use these predictors as base learners to learn representations for few-shot learning and show they offer better tradeoffs between feature size and performance across a range of few-shot recognition benchmarks.

Few-Shot Image Classification Few-Shot Learning

The Information Complexity of Learning Tasks, their Structure and their Distance

no code implementations 5 Apr 2019 Alessandro Achille, Giovanni Paolini, Glen Mbeng, Stefano Soatto

Our framework is the first to measure complexity in a way that accounts for the effect of the optimization scheme, which is critical in Deep Learning.

Memorization Transfer Learning

Zero Shot Learning with the Isoperimetric Loss

no code implementations 15 Mar 2019 Shay Deutsch, Andrea Bertozzi, Stefano Soatto

We introduce the isoperimetric loss as a regularization criterion for learning the map from a visual representation to a semantic embedding, to be used to transfer knowledge to unknown classes in a zero-shot learning setting.

Zero-Shot Learning

Task2Vec: Task Embedding for Meta-Learning

1 code implementation ICCV 2019 Alessandro Achille, Michael Lam, Rahul Tewari, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Stefano Soatto, Pietro Perona

We demonstrate that this embedding is capable of predicting task similarities that match our intuition about semantic and taxonomic relations between different visual tasks (e.g., tasks based on classifying different types of plants are similar). We also demonstrate the practical value of this framework for the meta-task of selecting a pre-trained feature extractor for a new task.

Meta-Learning

Dense Depth Posterior (DDP) from Single Image and Sparse Range

no code implementations CVPR 2019 Yanchao Yang, Alex Wong, Stefano Soatto

We present a deep learning system to infer the posterior distribution of a dense depth map associated with an image, by exploiting sparse range measurements, for instance from a lidar.

Depth Completion

Mono3D++: Monocular 3D Vehicle Detection with Two-Scale 3D Hypotheses and Task Priors

no code implementations 11 Jan 2019 Tong He, Stefano Soatto

We present a method to infer 3D pose and shape of vehicles from a single image.

Dynamics and Reachability of Learning Tasks

no code implementations 4 Oct 2018 Alessandro Achille, Glen Mbeng, Stefano Soatto

We compute the transition probability between two learning tasks, and show that it decomposes into two factors.

Semantic Textual Similarity Transfer Learning

Geo-Supervised Visual Depth Prediction

2 code implementations 30 Jul 2018 Xiaohan Fei, Alex Wong, Stefano Soatto

We propose using global orientation from inertial measurements, and the bias it induces on the shape of objects populating the scene, to inform visual 3D reconstruction.

3D Reconstruction Depth Estimation +1

Conditional Prior Networks for Optical Flow

1 code implementation ECCV 2018 Yanchao Yang, Stefano Soatto

On the other hand, fully supervised methods learn the regularity in the annotated data, without explicit regularization and with the risk of overfitting.

Optical Flow Estimation

Visual-Inertial Object Detection and Mapping

no code implementations ECCV 2018 Xiaohan Fei, Stefano Soatto

We present a method to populate an unknown environment with models of previously seen objects, placed in a Euclidean reference frame that is inferred causally and on-line using monocular video along with inertial sensors.

Object object-detection +1

Empirical Study of the Topology and Geometry of Deep Networks

no code implementations CVPR 2018 Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, Stefano Soatto

We specifically study the topology of classification regions created by deep networks, as well as their associated decision boundary.

General Classification

Input and Weight Space Smoothing for Semi-supervised Learning

no code implementations 23 May 2018 Safa Cicek, Stefano Soatto

We propose regularizing the empirical loss for semi-supervised learning by acting on both the input (data) space, and the weight (parameter) space.

Data Augmentation

SaaS: Speed as a Supervisor for Semi-supervised Learning

1 code implementation ECCV 2018 Safa Cicek, Alhussein Fawzi, Stefano Soatto

We introduce the SaaS Algorithm for semi-supervised learning, which uses learning speed during stochastic gradient descent in a deep neural network to measure the quality of an iterative estimate of the posterior probability of unknown labels.

OATM: Occlusion Aware Template Matching by Consensus Set Maximization

no code implementations CVPR 2018 Simon Korman, Mark Milam, Stefano Soatto

We present a novel approach to template matching that is efficient, can handle partial occlusions, and comes with provable performance guarantees.

Template Matching

Mathematics of Deep Learning

no code implementations 13 Dec 2017 Rene Vidal, Joan Bruna, Raja Giryes, Stefano Soatto

Recently there has been a dramatic increase in the performance of recognition systems due to the introduction of deep architectures for representation learning and classification.

General Classification Representation Learning

DeepRadiologyNet: Radiologist Level Pathology Detection in CT Head Images

no code implementations 26 Nov 2017 Jameson Merkow, Robert Lufkin, Kim Nguyen, Stefano Soatto, Zhuowen Tu, Andrea Vedaldi

Thus, DeepRadiologyNet enables significant reduction in the workload of human radiologists by automatically filtering studies and reporting on the high-confidence ones at an operating point well below the literal error rate for US Board Certified radiologists, estimated at 0.82%.

Critical Learning Periods in Deep Neural Networks

1 code implementation 24 Nov 2017 Alessandro Achille, Matteo Rovere, Stefano Soatto

Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training.

Disentanglement

Block-Cyclic Stochastic Coordinate Descent for Deep Neural Networks

no code implementations 20 Nov 2017 Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong

We present a stochastic first-order optimization algorithm, named BCSC, that adds a cyclic constraint to stochastic block-coordinate descent.

A Separation Principle for Control in the Age of Deep Learning

no code implementations 9 Nov 2017 Alessandro Achille, Stefano Soatto

Again this can be finitely-parametrized using a deep neural network, and already some applications are beginning to emerge.

Parle: parallelizing stochastic gradient descent

no code implementations 3 Jul 2017 Pratik Chaudhari, Carlo Baldassi, Riccardo Zecchina, Stefano Soatto, Ameet Talwalkar, Adam Oberman

We propose a new algorithm called Parle for parallel training of deep networks that converges 2-4x faster than a data-parallel implementation of SGD, while achieving significantly improved error rates that are nearly state-of-the-art on several benchmarks including CIFAR-10 and CIFAR-100, without introducing any additional hyper-parameters.

Visual-Inertial-Semantic Scene Representation for 3D Object Detection

no code implementations CVPR 2017 Jingming Dong, Xiaohan Fei, Stefano Soatto

We describe a system to detect objects in three-dimensional space using video and inertial sensors (accelerometer and gyrometer), ubiquitous in modern mobile platforms from phones to drones.

3D Object Detection object-detection

Zero Shot Learning via Multi-Scale Manifold Regularization

no code implementations CVPR 2017 Shay Deutsch, Soheil Kolouri, Kyungnam Kim, Yuri Owechko, Stefano Soatto

We address zero-shot learning using a new manifold alignment framework based on a localized multi-scale transform on graphs.

Zero-Shot Learning

S2F: Slow-To-Fast Interpolator Flow

no code implementations CVPR 2017 Yanchao Yang, Stefano Soatto

We introduce a method to compute optical flow at multiple scales of motion, without resorting to multi-resolution or combinatorial methods.

Optical Flow Estimation

Emergence of Invariance and Disentanglement in Deep Representations

no code implementations 5 Jun 2017 Alessandro Achille, Stefano Soatto

Using established principles from Statistics and Information Theory, we show that invariance to nuisance factors in a deep neural network is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations.

Disentanglement

Classification regions of deep neural networks

no code implementations 26 May 2017 Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, Stefano Soatto

The goal of this paper is to analyze the geometric properties of deep neural network classifiers in the input space.

Classification General Classification

Robustness of classifiers to universal perturbations: a geometric perspective

no code implementations ICLR 2018 Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, Pascal Frossard, Stefano Soatto

Deep networks have recently been shown to be vulnerable to universal perturbations: there exist very small image-agnostic perturbations that cause most natural images to be misclassified by such classifiers.

Adaptive Regularization of Some Inverse Problems in Image Analysis

no code implementations 9 May 2017 Byung-Woo Hong, Ja-Keoung Koo, Martin Burger, Stefano Soatto

We present an adaptive regularization scheme for optimizing composite energy functionals arising in image analysis problems.

Denoising Motion Estimation

Deep Relaxation: partial differential equations for optimizing deep neural networks

no code implementations 17 Apr 2017 Pratik Chaudhari, Adam Oberman, Stanley Osher, Stefano Soatto, Guillaume Carlier

In this paper we establish a connection between non-convex optimization methods for training deep neural networks and nonlinear partial differential equations (PDEs).

Multi-Label Segmentation via Residual-Driven Adaptive Regularization

no code implementations 27 Feb 2017 Byung-Woo Hong, Ja-Keoung Koo, Stefano Soatto

We present a variational multi-label segmentation algorithm based on a robust Huber loss for both the data and the regularizer, minimized within a convex optimization framework.

Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

2 code implementations 6 Nov 2016 Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann Lecun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina

This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.

Information Dropout: Learning Optimal Representations Through Noisy Computation

1 code implementation 4 Nov 2016 Alessandro Achille, Stefano Soatto

The cross-entropy loss commonly used in deep learning is closely related to the defining properties of optimal representations, but does not enforce some of the key properties.

Representation Learning Variational Inference

ShapeFit and ShapeKick for Robust, Scalable Structure from Motion

no code implementations 7 Aug 2016 Thomas Goldstein, Paul Hand, Choongbum Lee, Vladislav Voroninski, Stefano Soatto

We introduce a new method for location recovery from pair-wise directions that leverages an efficient convex program that comes with exact recovery guarantees, even in the presence of adversarial outliers.

Visual-Inertial-Semantic Scene Representation for 3-D Object Detection

no code implementations 13 Jun 2016 Jingming Dong, Xiaohan Fei, Stefano Soatto

We describe a system to detect objects in three-dimensional space using video and inertial sensors (accelerometer and gyrometer), ubiquitous in modern mobile platforms from phones to drones.

object-detection Object Detection

A Theory of Local Matching: SIFT and Beyond

no code implementations 19 Jan 2016 Hossein Mobahi, Stefano Soatto

Can it suggest new algorithms with reduced computational complexity or new descriptors with better accuracy for matching?

Self-Occlusions and Disocclusions in Causal Video Object Segmentation

no code implementations ICCV 2015 Yanchao Yang, Ganesh Sundaramoorthi, Stefano Soatto

We propose a method to detect disocclusion in video sequences of three-dimensional scenes and to partition the disoccluded regions into objects, defined by coherent deformation corresponding to surfaces in the scene.

Object Semantic Segmentation +2

On the energy landscape of deep networks

no code implementations 20 Nov 2015 Pratik Chaudhari, Stefano Soatto

Specifically, we show that a regularization term akin to a magnetic field can be modulated with a single scalar parameter to transition the loss function from a complex, non-convex landscape with exponentially many local minima, to a phase with a polynomial number of minima, all the way down to a trivial landscape with a unique minimum.

A Simple Hierarchical Pooling Data Structure for Loop Closure

no code implementations 20 Nov 2015 Xiaohan Fei, Konstantine Tsotsos, Stefano Soatto

We propose a data structure obtained by hierarchically averaging bag-of-word descriptors during a sequence of views that achieves average speedups in large-scale loop closure applications ranging from 4 to 20 times on benchmark datasets.

Efficient Minimal-Surface Regularization of Perspective Depth Maps in Variational Stereo

no code implementations CVPR 2015 Gottfried Graber, Jonathan Balzer, Stefano Soatto, Thomas Pock

We propose a method for dense three-dimensional surface reconstruction that leverages the strengths of shape-based approaches, by imposing regularization that respects the geometry of the surface, and the strength of depth-map-based stereo, by avoiding costly computation of surface topology.

Surface Reconstruction

Causal Video Object Segmentation From Persistence of Occlusions

no code implementations CVPR 2015 Brian Taylor, Vasiliy Karasev, Stefano Soatto

Occlusion relations inform the partition of the image domain into "objects" but are difficult to determine from a single image or short-baseline video.

Object Semantic Segmentation +2

Multi-View Feature Engineering and Learning

no code implementations CVPR 2015 Jingming Dong, Nikolaos Karianakis, Damek Davis, Joshua Hernandez, Jonathan Balzer, Stefano Soatto

We frame the problem of local representation of imaging data as the computation of minimal sufficient statistics that are invariant to nuisance variability induced by viewpoint and illumination.

Feature Engineering

Texture Representations for Image and Video Synthesis

no code implementations CVPR 2015 Georgios Georgiadis, Alessandro Chiuso, Stefano Soatto

In texture synthesis and classification, algorithms require a small texture to be provided as an input, which is assumed to be representative of a larger region to be re-synthesized or categorized.

General Classification Texture Synthesis

An Empirical Evaluation of Current Convolutional Architectures' Ability to Manage Nuisance Location and Scale Variability

no code implementations CVPR 2016 Nikolaos Karianakis, Jingming Dong, Stefano Soatto

We conduct an empirical study to test the ability of Convolutional Neural Networks (CNNs) to reduce the effects of nuisance transformations of the input data, such as location, scale and aspect ratio.

General Classification

Boosting Convolutional Features for Robust Object Proposals

no code implementations 21 Mar 2015 Nikolaos Karianakis, Thomas J. Fuchs, Stefano Soatto

Modern detection algorithms like Regions with CNNs (Girshick et al., 2014) rely on Selective Search (Uijlings et al., 2013) to propose regions which with high probability represent objects, where in turn CNNs are deployed for classification.

General Classification Image Classification +4

Domain-Size Pooling in Local Descriptors: DSP-SIFT

no code implementations CVPR 2015 Jingming Dong, Stefano Soatto

We introduce a simple modification of local image descriptors, such as SIFT, based on pooling gradient orientations across different domain sizes, in addition to spatial locations.

Visual Scene Representations: Contrast, Scaling and Occlusion

no code implementations 20 Dec 2014 Stefano Soatto, Jingming Dong, Nikolaos Karianakis

We study the structure of representations, defined as approximations of minimal sufficient statistics that are maximal invariants to nuisance factors, for visual data subject to scaling and occlusion of line-of-sight.

Two-sample testing

Visual Representations: Defining Properties and Deep Approximations

no code implementations 27 Nov 2014 Stefano Soatto, Alessandro Chiuso

Visual representations are defined in terms of minimal sufficient statistics of visual data, for a class of tasks, that are also invariant to nuisance variability.

Cavlectometry: Towards Holistic Reconstruction of Large Mirror Objects

no code implementations 14 Sep 2014 Jonathan Balzer, Daniel Acevedo-Feliz, Stefano Soatto, Sebastian Höfer, Markus Hadwiger, Jürgen Beyerer

We introduce a method based on the deflectometry principle for the reconstruction of specular objects exhibiting significant size and geometric complexity.

Surface Reconstruction

Second-Order Shape Optimization for Geometric Inverse Problems in Vision

no code implementations CVPR 2014 Jonathan Balzer, Stefano Soatto

We develop a method for optimization in shape spaces, i.e., sets of surfaces modulo re-parametrization.

Active Frame, Location, and Detector Selection for Automated and Manual Video Annotation

no code implementations CVPR 2014 Vasiliy Karasev, Avinash Ravichandran, Stefano Soatto

We describe an information-driven active selection approach to determine which detectors to deploy at which location in which frame of a video to minimize semantic class label uncertainty at every pixel, with the smallest computational cost that ensures a given uncertainty bound.

Asymmetric Sparse Kernel Approximations for Large-scale Visual Search

no code implementations CVPR 2014 Damek Davis, Jonathan Balzer, Stefano Soatto

We introduce an asymmetric sparse approximate embedding optimized for fast kernel comparison operations arising in large-scale visual search.

Image Retrieval Retrieval

On the Design and Analysis of Multiple View Descriptors

no code implementations 23 Nov 2013 Jingming Dong, Jonathan Balzer, Damek Davis, Joshua Hernandez, Stefano Soatto

We propose an extension of popular descriptors based on gradient orientation histograms (HOG, computed in a single image) to multiple views.

Specificity

Nonlinearly Constrained MRFs: Exploring the Intrinsic Dimensions of Higher-Order Cliques

no code implementations CVPR 2013 Yun Zeng, Chaohui Wang, Stefano Soatto, Shing-Tung Yau

This paper introduces an efficient approach to integrating non-local statistics into the higher-order Markov Random Fields (MRFs) framework.

Image Segmentation Semantic Segmentation

Controlled Recognition Bounds for Visual Learning and Exploration

no code implementations NeurIPS 2012 Vasiliy Karasev, Alessandro Chiuso, Stefano Soatto

We describe the tradeoff between the performance in a visual recognition problem and the control authority that the agent can exercise on the sensing process.

Multiple Instance Filtering

no code implementations NeurIPS 2011 Kamil A. Wnuk, Stefano Soatto

We propose a robust filtering approach based on semi-supervised and multiple instance learning (MIL).

Multiple Instance Learning Visual Tracking

Steps Towards a Theory of Visual Information: Active Perception, Signal-to-Symbol Conversion and the Interplay Between Sensing and Control

no code implementations 10 Oct 2011 Stefano Soatto

The concept of Actionable Information is described, that relates to a notion of information championed by J. Gibson, and a notion of "complete information" that relates to the minimal sufficient statistics of a complete representation.
