Search Results for author: Arushi Goel

Found 15 papers, 5 papers with code

Audio Dialogues: Dialogues dataset for audio and music understanding

no code implementations • 11 Apr 2024 • Arushi Goel, Zhifeng Kong, Rafael Valle, Bryan Catanzaro

Existing datasets for audio understanding primarily focus on single-turn interactions (i.e., audio captioning, audio question answering) for describing audio in natural language, thus limiting understanding audio via interactive dialogue.

Audio captioning Audio Question Answering +3

Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

no code implementations • 2 Feb 2024 • Zhifeng Kong, Arushi Goel, Rohan Badlani, Wei Ping, Rafael Valle, Bryan Catanzaro

Augmenting large language models (LLMs) to understand audio -- including non-speech sounds and non-verbal speech -- is critically important for diverse real-world applications of LLMs.

Few-Shot Learning In-Context Learning +2

Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter

1 code implementation • 9 Nov 2023 • Georgios Tziafas, Yucheng Xu, Arushi Goel, Mohammadreza Kasaei, Zhibin Li, Hamidreza Kasaei

To address these limitations, we develop a challenging benchmark based on cluttered indoor scenes from OCID dataset, for which we generate referring expressions and connect them with 4-DoF grasp poses.

Object Visual Grounding

Semi-supervised multimodal coreference resolution in image narrations

1 code implementation • 20 Oct 2023 • Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen

In this paper, we study multimodal coreference resolution, specifically where a longer descriptive text, i.e., a narration, is paired with an image.

coreference-resolution Descriptive

Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories

1 code implementation • ICCV 2023 • Thomas Mensink, Jasper Uijlings, Lluis Castrejon, Arushi Goel, Felipe Cadar, Howard Zhou, Fei Sha, André Araujo, Vittorio Ferrari

Empirically, we show that our dataset poses a hard challenge for large vision+language models as they perform poorly on our dataset: PaLI [14] is state-of-the-art on OK-VQA [37], yet it only achieves 13.0% accuracy on our dataset.

Question Answering Retrieval +1

Controllable Video Generation by Learning the Underlying Dynamical System with Neural ODE

no code implementations • 9 Mar 2023 • Yucheng Xu, Li Nanbo, Arushi Goel, Zijian Guo, Zonghai Yao, Hamidreza Kasaei, Mohammadreze Kasaei, Zhibin Li

Videos depict the change of complex dynamical systems over time in the form of discrete image sequences.

Video Generation

Who are you referring to? Coreference resolution in image narrations

no code implementations • ICCV 2023 • Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen

Coreference resolution aims to identify words and phrases which refer to the same entity in a text, a core task in natural language processing.

coreference-resolution

WiCV 2022: The Tenth Women In Computer Vision Workshop

no code implementations • 24 Aug 2022 • Doris Antensteiner, Silvia Bucci, Arushi Goel, Marah Halawa, Niveditha Kalavakonda, Tejaswi Kasarla, Miaomiao Liu, Nermin Samet, Ivaxi Sheth

In this paper, we present the details of Women in Computer Vision Workshop - WiCV 2022, organized alongside the hybrid CVPR 2022 in New Orleans, Louisiana.

WiCV 2021: The Eighth Women In Computer Vision Workshop

no code implementations • 11 Mar 2022 • Arushi Goel, Niveditha Kalavakonda, Nour Karessli, Tejaswi Kasarla, Kathryn Leonard, Boyi Li, Nermin Samet, Ghada Zamzmi

In this paper, we present the details of Women in Computer Vision Workshop - WiCV 2021, organized alongside the virtual CVPR 2021.

PARS: Pseudo-Label Aware Robust Sample Selection for Learning with Noisy Labels

no code implementations • 26 Jan 2022 • Arushi Goel, Yunlong Jiao, Jordan Massiah

In this paper, we propose PARS: Pseudo-Label Aware Robust Sample Selection, a hybrid approach that combines the best from all three worlds in a joint-training framework to achieve robustness to noisy labels.

Learning with noisy labels Pseudo Label

Not All Relations are Equal: Mining Informative Labels for Scene Graph Generation

no code implementations • CVPR 2022 • Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen

Scene graph generation (SGG) aims to capture a wide variety of interactions between pairs of objects, which is essential for full scene understanding.

Graph Generation Informativeness +2

Injecting Prior Knowledge into Image Caption Generation

no code implementations • 22 Nov 2019 • Arushi Goel, Basura Fernando, Thanh-Son Nguyen, Hakan Bilen

Automatically generating natural language descriptions from an image is a challenging problem in artificial intelligence that requires a good understanding of the visual and textual signals and the correlations between them.

Caption Generation Image Captioning

Cross-Domain Image Classification through Neural-Style Transfer Data Augmentation

1 code implementation • 12 Oct 2019 • Yijie Xu, Arushi Goel

In particular, the lack of sufficient amounts of domain-specific data can reduce the accuracy of a classifier.

Classification Data Augmentation +3

An End-to-End Network for Generating Social Relationship Graphs

no code implementations • CVPR 2019 • Arushi Goel, Keng Teck Ma, Cheston Tan

Inferring the social context in a given visual scene not only involves recognizing objects, but also demands a more in-depth understanding of the relationships and attributes of the people involved.

Attribute Graph Generation +1

A Multimodal LSTM for Predicting Listener Empathic Responses Over Time

1 code implementation • 12 Dec 2018 • Zhi-Xuan Tan, Arushi Goel, Thanh-Son Nguyen, Desmond C. Ong

People naturally understand the emotions of, and often also empathize with, those around them.
