Search Results for author: Vicente Ordonez

Found 48 papers, 30 papers with code

PropTest: Automatic Property Testing for Improved Visual Programming

no code implementations • 25 Mar 2024 • Jaywon Koo, Ziyan Yang, Paola Cascante-Bonilla, Baishakhi Ray, Vicente Ordonez

Visual Programming has emerged as an alternative to end-to-end black-box visual reasoning models.

Question Answering Referring Expression +3

Learning from Models and Data for Visual Grounding

no code implementations • 20 Mar 2024 • Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez

We introduce SynGround, a novel framework that combines data-driven learning and knowledge transfer from various large-scale pretrained models to enhance the visual grounding capabilities of a pretrained vision-and-language model.

Language Modelling Large Language Model +2

Grounding Language Models for Visual Entity Recognition

1 code implementation • 28 Feb 2024 • Zilin Xiao, Ming Gong, Paola Cascante-Bonilla, Xingyao Zhang, Jie Wu, Vicente Ordonez

We introduce AutoVER, an Autoregressive model for Visual Entity Recognition.

Language Modelling Large Language Model +2

Improved Visual Grounding through Self-Consistent Explanations

no code implementations • 7 Dec 2023 • Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez

Vision-and-language models trained to match images with text can be combined with visual explanation methods to point to the locations of specific objects in an image.

Language Modelling Large Language Model +1

ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation

1 code implementation • 30 Nov 2023 • Moayed Haji-Ali, Guha Balakrishnan, Vicente Ordonez

We propose ElasticDiffusion, a novel training-free decoding method that enables pretrained text-to-image diffusion models to generate images with various sizes.

Image Generation

Characterizing Video Question Answering with Sparsified Inputs

no code implementations • 27 Nov 2023 • Shiyuan Huang, Robinson Piramuthu, Vicente Ordonez, Shih-Fu Chang, Gunnar A. Sigurdsson

From our experiments, we have observed only 5.2%-5.8% loss of performance with only 10% of video lengths, which corresponds to 2-4 frames selected from each video.

Question Answering Video Question Answering

SCoRD: Subject-Conditional Relation Detection with Text-Augmented Data

1 code implementation • 24 Aug 2023 • Ziyan Yang, Kushal Kafle, Zhe Lin, Scott Cohen, Zhihong Ding, Vicente Ordonez

To solve this problem, we propose an auto-regressive model that, given a subject, predicts its relations, objects, and object locations by casting this output as a sequence of tokens.

Object Relation

Going Beyond Nouns With Vision & Language Models Using Synthetic Data

1 code implementation • ICCV 2023 • Paola Cascante-Bonilla, Khaled Shehada, James Seale Smith, Sivan Doveh, Donghyun Kim, Rameswar Panda, Gül Varol, Aude Oliva, Vicente Ordonez, Rogerio Feris, Leonid Karlinsky

We contribute Synthetic Visual Concepts (SyViC), a million-scale synthetic dataset and data generation codebase that allows generating additional suitable data to improve VLC understanding and compositional reasoning of VL models.

Ranked #68 on Visual Reasoning on Winoground

Sentence Visual Reasoning

ViC-MAE: Self-Supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders

no code implementations • 21 Mar 2023 • Jefferson Hernandez, Ruben Villegas, Vicente Ordonez

We show that visual representations learned under ViC-MAE generalize well to both video and image classification tasks.

Ranked #4 on Image Classification on Places365 (using extra training data)

Action Classification Action Recognition +5

Variation of Gender Biases in Visual Recognition Models Before and After Finetuning

no code implementations • 14 Mar 2023 • Jaspreet Ranjit, Tianlu Wang, Baishakhi Ray, Vicente Ordonez

We also find that (2) models finetuned on larger scale datasets are more likely to introduce new biased associations.

Object Recognition

On the Transferability of Visual Features in Generalized Zero-Shot Learning

1 code implementation • 22 Nov 2022 • Paola Cascante-Bonilla, Leonid Karlinsky, James Seale Smith, Yanjun Qi, Vicente Ordonez

Generalized Zero-Shot Learning (GZSL) aims to train a classifier that can generalize to unseen classes, using a set of attributes as auxiliary information, and the visual features extracted from a pre-trained convolutional neural network.

Generalized Zero-Shot Learning Knowledge Distillation +2

Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations

1 code implementation • CVPR 2023 • Ziyan Yang, Kushal Kafle, Franck Dernoncourt, Vicente Ordonez

We propose a margin-based loss for tuning joint vision-language models so that their gradient-based explanations are consistent with region-level annotations provided by humans for relatively smaller grounding datasets.

Language Modelling Referring Expression +2

Towards Understanding Gender-Seniority Compound Bias in Natural Language Generation

1 code implementation • LREC 2022 • Samhita Honnavalli, Aesha Parekh, Lily Ou, Sophie Groenwold, Sharon Levy, Vicente Ordonez, William Yang Wang

Our results show that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.

Text Generation

SimVQA: Exploring Simulated Environments for Visual Question Answering

no code implementations • CVPR 2022 • Paola Cascante-Bonilla, Hui Wu, Letao Wang, Rogerio Feris, Vicente Ordonez

By exploiting 3D and physics simulation platforms, we provide a pipeline to generate synthetic data to expand and replace type-specific questions and answers without risking the exposure of sensitive or personal data that might be present in real images.

Data Augmentation Question Answering +1

Repairing Group-Level Errors for DNNs Using Weighted Regularization

1 code implementation • 24 Mar 2022 • Ziyuan Zhong, Yuchi Tian, Conor J. Sweeney, Vicente Ordonez, Baishakhi Ray

In particular, it can repair confusion error and bias error of DNN models for both single-label and multi-label image classifications.

CLIP-Lite: Information Efficient Visual Representation Learning with Language Supervision

1 code implementation • 14 Dec 2021 • Aman Shrivastava, Ramprasaath R. Selvaraju, Nikhil Naik, Vicente Ordonez

We propose CLIP-Lite, an information efficient method for visual representation learning by feature alignment with textual annotations.

Contrastive Learning Representation Learning +5

Estimating and Maximizing Mutual Information for Knowledge Distillation

no code implementations • 29 Oct 2021 • Aman Shrivastava, Yanjun Qi, Vicente Ordonez

Our empirical results show that MIMKD outperforms competing approaches across a wide range of student-teacher pairs with different capacities and different architectures, including when the student networks have extremely low capacity.

Knowledge Distillation

Evolving Image Compositions for Feature Representation Learning

no code implementations • 16 Jun 2021 • Paola Cascante-Bonilla, Arshdeep Sekhon, Yanjun Qi, Vicente Ordonez

This paper proposes PatchMix, a data augmentation method that creates new samples by composing patches from pairs of images in a grid-like pattern.

Data Augmentation Representation Learning +1

Instance-level Image Retrieval using Reranking Transformers

1 code implementation • ICCV 2021 • Fuwen Tan, Jiangbo Yuan, Vicente Ordonez

Instance-level image retrieval is the task of searching in a large database for images that match an object in a query image.

Ranked #3 on Image Retrieval on RParis (Medium)

Image Retrieval Retrieval

Chair Segments: A Compact Benchmark for the Study of Object Segmentation

1 code implementation • 2 Dec 2020 • Leticia Pinto-Alva, Ian K. Torres, Rosangel Garcia, Ziyan Yang, Vicente Ordonez

We aim for ChairSegments to be the equivalent of the CIFAR-10 dataset but for quickly designing and iterating over novel model architectures for segmentation.

Image Classification Object Discovery +2

General Multi-label Image Classification with Transformers

2 code implementations • CVPR 2021 • Jack Lanchantin, Tianlu Wang, Vicente Ordonez, Yanjun Qi

Multi-label image classification is the task of predicting a set of labels corresponding to objects, attributes or other entities present in an image.

Classification General Classification +1

Using Visual Feature Space as a Pivot Across Languages

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Ziyan Yang, Leticia Pinto-Alva, Franck Dernoncourt, Vicente Ordonez

Our work aims to leverage visual feature space to pass information across languages.

Machine Translation Translation

Visual News: Benchmark and Challenges in News Image Captioning

1 code implementation • EMNLP 2021 • Fuxiao Liu, Yinghan Wang, Tianlu Wang, Vicente Ordonez

We propose Visual News Captioner, an entity-aware model for the task of news image captioning.

Image Captioning

Black-box Explanation of Object Detectors via Saliency Maps

2 code implementations • CVPR 2021 • Vitali Petsiuk, Rajiv Jain, Varun Manjunatha, Vlad I. Morariu, Ashutosh Mehra, Vicente Ordonez, Kate Saenko

We propose D-RISE, a method for generating visual explanations for the predictions of object detectors.

Object object-detection +1

Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation

1 code implementation • ACL 2020 • Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann, Vicente Ordonez, Caiming Xiong

Word embeddings derived from human-generated corpora inherit strong gender bias which can be further amplified by downstream models.

Word Embeddings

Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning

1 code implementation • 16 Jan 2020 • Paola Cascante-Bonilla, Fuwen Tan, Yanjun Qi, Vicente Ordonez

Pseudo-labeling works by applying pseudo-labels to samples in the unlabeled set by using a model trained on the combination of the labeled samples and any previously pseudo-labeled samples, and iteratively repeating this process in a self-training cycle.

Image Classification

MEDIRL: Predicting the Visual Attention of Drivers via Maximum Entropy Deep Inverse Reinforcement Learning

2 code implementations • ICCV 2021 • Sonia Baee, Erfan Pakdamanian, Inki Kim, Lu Feng, Vicente Ordonez, Laura Barnes

Inspired by human visual attention, we propose a novel inverse reinforcement learning formulation using Maximum Entropy Deep Inverse Reinforcement Learning (MEDIRL) for predicting the visual attention of drivers in accident-prone situations.

Autonomous Vehicles reinforcement-learning +1

Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

1 code implementation • NeurIPS 2019 • Fuwen Tan, Paola Cascante-Bonilla, Xiaoxiao Guo, Hui Wu, Song Feng, Vicente Ordonez

We show that using multiple rounds of natural language queries as input can be surprisingly effective to find arbitrarily specific images of complex scenes.

Image Retrieval Natural Language Queries +1

Moviescope: Large-scale Analysis of Movies using Multiple Modalities

2 code implementations • 8 Aug 2019 • Paola Cascante-Bonilla, Kalpathy Sitaraman, Mengjia Luo, Vicente Ordonez

Film media is a rich form of artistic expression.

Testing DNN Image Classifiers for Confusion & Bias Errors

1 code implementation • 20 May 2019 • Yuchi Tian, Ziyuan Zhong, Vicente Ordonez, Gail Kaiser, Baishakhi Ray

We found that many of the reported erroneous cases in popular DNN image classifiers occur because the trained models confuse one class with another or show biases towards some classes over others.

Avg DNN Testing +2

Gender Bias in Contextualized Word Embeddings

2 code implementations • NAACL 2019 • Jieyu Zhao, Tianlu Wang, Mark Yatskar, Ryan Cotterell, Vicente Ordonez, Kai-Wei Chang

In this paper, we quantify, analyze and mitigate gender bias exhibited in ELMo's contextualized word vectors.

Word Embeddings

Chat-crowd: A Dialog-based Platform for Visual Layout Composition

no code implementations • NAACL 2019 • Paola Cascante-Bonilla, Xuwang Yin, Vicente Ordonez, Song Feng

In this paper we introduce Chat-crowd, an interactive environment for visual layout composition via conversational interactions.

Goal-Oriented Dialog

Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations

2 code implementations • ICCV 2019 • Tianlu Wang, Jieyu Zhao, Mark Yatskar, Kai-Wei Chang, Vicente Ordonez

In this work, we present a framework to measure and mitigate intrinsic biases with respect to protected variables, such as gender, in visual recognition tasks.

Temporal Action Localization

Text2Scene: Generating Compositional Scenes from Textual Descriptions

3 code implementations • CVPR 2019 • Fuwen Tan, Song Feng, Vicente Ordonez

In this paper, we propose Text2Scene, a model that generates various forms of compositional scene representations from natural language descriptions.

Deep Feature Aggregation and Image Re-ranking with Heat Diffusion for Image Retrieval

1 code implementation • 22 May 2018 • Shanmin Pang, Jin Ma, Jianru Xue, Jihua Zhu, Vicente Ordonez

We show that by considering each deep feature as a heat source, our unsupervised aggregation method is able to avoid over-representation of bursty features.

Image Retrieval Re-Ranking +1

Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods

3 code implementations • NAACL 2018 • Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang

We introduce a new benchmark, WinoBias, for coreference resolution focused on gender bias.

coreference-resolution Data Augmentation

Feedback-prop: Convolutional Neural Network Inference under Partial Evidence

1 code implementation • CVPR 2018 • Tianlu Wang, Kota Yamaguchi, Vicente Ordonez

We propose an inference procedure for deep convolutional neural networks (CNNs) when partial evidence is available.

Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints

3 code implementations • EMNLP 2017 • Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang

Language is increasingly being used to define rich visual recognition problems with supporting image collections sourced from the web.

General Classification Semantic Role Labeling +1

OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts

no code implementations • EMNLP 2017 • Xuwang Yin, Vicente Ordonez

Generating captions for images is a task that has recently received considerable attention.

Caption Generation Descriptive +3

Where and Who? Automatic Semantic-Aware Person Composition

no code implementations • 4 Jun 2017 • Fuwen Tan, Crispin Bernier, Benjamin Cohen, Vicente Ordonez, Connelly Barnes

Image compositing is a method used to generate realistic yet fake imagery by inserting contents from one image into another.

Commonly Uncommon: Semantic Sparsity in Situation Recognition

2 code implementations • CVPR 2017 • Mark Yatskar, Vicente Ordonez, Luke Zettlemoyer, Ali Farhadi

Semantic sparsity is a common challenge in structured visual classification problems; when the output space is complex, the vast majority of the possible predictions are rarely, if ever, seen in the training set.

Ranked #11 on Situation Recognition on imSitu