Search Results for author: Mengmi Zhang

Found 28 papers, 20 papers with code

Make Me Happier: Evoking Emotions Through Image Diffusion Models

no code implementations 13 Mar 2024 Qing Lin, Jingfeng Zhang, Yew Soon Ong, Mengmi Zhang

For the first time, we present a novel challenge of emotion-evoked image generation, aiming to synthesize images that evoke target emotions while retaining the semantics and structures of the original scenes.

Image Generation
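
For readers who want to try the general task described above, the snippet below is a minimal sketch of emotion-conditioned image editing using an off-the-shelf image-to-image diffusion pipeline. It is not the paper's method; the checkpoint name, prompt template, target emotion, and strength value are illustrative assumptions, and a CUDA device is assumed.

# Hedged sketch: emotion-conditioned image editing with an off-the-shelf
# img2img diffusion pipeline. This is NOT the paper's method; the checkpoint,
# prompt template, and strength are illustrative placeholders.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",            # placeholder checkpoint
    torch_dtype=torch.float16,
).to("cuda")

source = Image.open("scene.jpg").convert("RGB").resize((512, 512))
target_emotion = "happiness"                      # hypothetical target emotion

# Low strength keeps most of the original structure while nudging the mood.
edited = pipe(
    prompt=f"a photo evoking {target_emotion}, same scene, same layout",
    image=source,
    strength=0.35,
    guidance_scale=7.5,
).images[0]
edited.save("scene_happier.jpg")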

Adaptive Visual Scene Understanding: Incremental Scene Graph Generation

1 code implementation 2 Oct 2023 Naitik Khandelwal, Xiao Liu, Mengmi Zhang

To address the lack of continual learning methodologies in SGG, we introduce the comprehensive Continual ScenE Graph Generation (CSEGG) dataset along with 3 learning scenarios and 8 evaluation metrics.

Benchmarking Continual Learning +5
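
As a rough illustration of how a continual scene-graph-generation setup can be organized, the sketch below represents scene graphs as (subject, predicate, object) triplets and splits predicate classes into sequential tasks. The task split and class names are hypothetical, not the CSEGG protocol.

# Hedged sketch of a continual scene-graph-generation setup: scene graphs as
# (subject, predicate, object) triplets, with predicate classes split into
# sequential tasks. The split below is illustrative, not the CSEGG protocol.
from dataclasses import dataclass

@dataclass
class Triplet:
    subject: str      # e.g. "person"
    predicate: str    # e.g. "riding"
    obj: str          # e.g. "horse"

@dataclass
class SceneGraph:
    image_id: str
    triplets: list

# Hypothetical incremental tasks: new predicate classes appear over time.
task_predicates = [
    {"on", "has", "near"},          # task 1
    {"riding", "holding"},          # task 2
    {"eating", "wearing"},          # task 3
]

def filter_graph_for_task(graph: SceneGraph, task_id: int) -> SceneGraph:
    """Keep only triplets whose predicates have been introduced so far."""
    seen = set().union(*task_predicates[: task_id + 1])
    kept = [t for t in graph.triplets if t.predicate in seen]
    return SceneGraph(graph.image_id, kept)

g = SceneGraph("img_001", [Triplet("person", "riding", "horse"),
                           Triplet("person", "wearing", "hat")])
print(filter_graph_for_task(g, 1).triplets)  # "wearing" not yet introduced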

Integrating Curricula with Replays: Its Effects on Continual Learning

1 code implementation 8 Jul 2023 Ren Jie Tee, Mengmi Zhang

Our study takes initial steps in examining the impact of integrating curricula with replay methods on continual learning in three specific aspects: the interleaved frequency of replayed exemplars with training data, the sequence in which exemplars are replayed, and the strategy for selecting exemplars into the replay buffer.

Continual Learning · Transfer Learning
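
The three knobs named in the abstract above map naturally onto parameters of a replay buffer. The sketch below is a generic, hedged illustration of that mapping, not the paper's implementation; buffer size, difficulty scores, and the selection/ordering choices are placeholders.

# Hedged sketch of curriculum-aware replay. The three studied aspects appear
# as parameters: how often replayed exemplars are interleaved with the stream,
# the order in which they are replayed, and how exemplars are selected for the
# buffer. Difficulty scores and buffer size are illustrative placeholders.
import random

class CurriculumReplayBuffer:
    def __init__(self, capacity=200, selection="easiest", order="easy_to_hard"):
        self.capacity = capacity
        self.selection = selection
        self.order = order
        self.buffer = []  # list of (example, difficulty) pairs

    def maybe_add(self, example, difficulty):
        self.buffer.append((example, difficulty))
        if len(self.buffer) > self.capacity:
            # Selection strategy: keep the easiest (or hardest) exemplars.
            reverse = self.selection == "hardest"
            self.buffer.sort(key=lambda x: x[1], reverse=reverse)
            self.buffer = self.buffer[: self.capacity]

    def sample(self, k):
        batch = random.sample(self.buffer, min(k, len(self.buffer)))
        # Replay order: a simple curriculum over the sampled exemplars.
        batch.sort(key=lambda x: x[1], reverse=(self.order == "hard_to_easy"))
        return [ex for ex, _ in batch]

def train(stream, buffer, interleave_every=2, replay_k=8):
    for step, (example, difficulty) in enumerate(stream):
        train_batch = [example]
        # Interleave frequency: mix in replayed exemplars every few steps.
        if step % interleave_every == 0 and buffer.buffer:
            train_batch += buffer.sample(replay_k)
        # ... update the model on train_batch here ...
        buffer.maybe_add(example, difficulty)

stream = [(f"example_{i}", random.random()) for i in range(10)]
train(stream, CurriculumReplayBuffer())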

Training-free Object Counting with Prompts

1 code implementation 30 Jun 2023 Zenglin Shi, Ying Sun, Mengmi Zhang

However, the vanilla mask generation method of SAM lacks class-specific information in the masks, resulting in inferior counting accuracy.

Object Segmentation +2
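
The snippet below illustrates the point made in the abstract: SAM's automatic mask generator returns class-agnostic masks, so counting a specific class requires an extra class-specific filter. The filter shown here (a caller-supplied similarity test against an exemplar crop) is a hedged stand-in, not the paper's scoring method; the checkpoint path is assumed.

# Hedged sketch: class-agnostic masks from SAM, then a hypothetical
# class-specific filter before counting. The similarity scorer below is a
# placeholder, not the paper's method; the checkpoint path is assumed.
import numpy as np
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

def count_objects(image_rgb: np.ndarray, exemplar_box, is_target_class) -> int:
    """Count masks that a class-specific scorer accepts.

    image_rgb:       HxWx3 uint8 image.
    exemplar_box:    (x0, y0, x1, y1) box around one example object.
    is_target_class: callable(mask_dict, exemplar_crop) -> bool; stand-in for
                     any feature-similarity test (e.g. CLIP embeddings).
    """
    masks = mask_generator.generate(image_rgb)  # class-agnostic proposals
    x0, y0, x1, y1 = exemplar_box
    exemplar_crop = image_rgb[y0:y1, x0:x1]
    # Without this filter, every mask is counted, regardless of class.
    return sum(1 for m in masks if is_target_class(m, exemplar_crop))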

Object-centric Learning with Cyclic Walks between Parts and Whole

1 code implementation NeurIPS 2023 Ziyu Wang, Mike Zheng Shou, Mengmi Zhang

To capture compositional entities of the scene, we proposed cyclic walks between perceptual features extracted from vision transformers and object entities.

Object
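
A hedged sketch of the cyclic-walk idea follows: attention walks from patch features to slots (object entities) and back, and a round trip should land on the starting patch. Shapes, the temperature, and the loss form are illustrative assumptions, not the paper's exact recipe.

# Hedged sketch of part-whole cyclic walks: attention from patch features to
# slots and back; a round trip should return to the starting patch. Shapes,
# temperature, and loss form are illustrative, not the paper's exact recipe.
import torch
import torch.nn.functional as F

def cyclic_walk_loss(features, slots, tau=0.1):
    """features: (N, D) patch tokens, slots: (K, D) object entities."""
    f = F.normalize(features, dim=-1)
    s = F.normalize(slots, dim=-1)
    sim = f @ s.t() / tau                     # (N, K) part-to-whole similarity
    p_f2s = sim.softmax(dim=1)                # walk: feature -> slot
    p_s2f = sim.t().softmax(dim=1)            # walk: slot -> feature
    round_trip = p_f2s @ p_s2f                # (N, N) feature -> feature
    target = torch.arange(features.size(0))   # each patch should return home
    return F.nll_loss(torch.log(round_trip + 1e-8), target)

feats = torch.randn(196, 256)                 # e.g. 14x14 ViT patch tokens
slots = torch.randn(7, 256)                   # e.g. 7 object slots
print(cyclic_walk_loss(feats, slots).item())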

Efficient Zero-shot Visual Search via Target and Context-aware Transformer

no code implementations 24 Nov 2022 Zhiwei Ding, Xuezhe Ren, Erwan David, Melissa Vo, Gabriel Kreiman, Mengmi Zhang

Target modulation is computed as patch-wise local relevance between the target and search images, whereas contextual modulation is applied in a global fashion.
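
The sketch below is a hedged illustration of these two signals: a patch-wise relevance map between target and search embeddings, plus a crude global context score from the mean scene vector. The patch encoder is a stand-in for any ViT-style backbone; this is not the paper's architecture.

# Hedged sketch of the two modulations described above: a patch-wise target
# relevance map plus a single global context signal. The embeddings are
# placeholders for any patch encoder; this is not the paper's architecture.
import torch
import torch.nn.functional as F

def target_modulation(search_patches, target_patches):
    """search_patches: (N, D), target_patches: (M, D) -> (N,) relevance."""
    s = F.normalize(search_patches, dim=-1)
    t = F.normalize(target_patches, dim=-1)
    sim = s @ t.t()                    # (N, M) patch-wise local similarity
    return sim.max(dim=1).values       # best-matching target patch per location

def contextual_modulation(search_patches):
    """A crude global context signal: similarity to the mean scene vector."""
    ctx = F.normalize(search_patches.mean(dim=0, keepdim=True), dim=-1)
    return (F.normalize(search_patches, dim=-1) @ ctx.t()).squeeze(1)

search = torch.randn(196, 384)   # placeholder search-image patch embeddings
target = torch.randn(16, 384)    # placeholder target-image patch embeddings
priority_map = target_modulation(search, target) + contextual_modulation(search)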

Reason from Context with Self-supervised Learning

no code implementations 23 Nov 2022 Xiao Liu, Ankur Sikarwar, Gabriel Kreiman, Zenglin Shi, Mengmi Zhang

To better accommodate the object-centric nature of current downstream tasks such as object recognition and detection, various methods have been proposed to suppress contextual biases or disentangle objects from contexts.

Object · Object Recognition +2

Improving generalization by mimicking the human visual diet

1 code implementation 15 Jun 2022 Spandan Madan, You Li, Mengmi Zhang, Hanspeter Pfister, Gabriel Kreiman

We present a new perspective on bridging the generalization gap between biological and computer vision -- mimicking the human visual diet.

Domain Generalization

Label-Efficient Online Continual Object Detection in Streaming Video

1 code implementation ICCV 2023 Jay Zhangjie Wu, David Junhao Zhang, Wynne Hsu, Mengmi Zhang, Mike Zheng Shou

Remarkably, with only 25% annotated video frames, our method still outperforms the base CL learners, which are trained with 100% annotations on all video frames.

Continual Learning · Hippocampus +3

Visual Search Asymmetry: Deep Nets and Humans Share Similar Inherent Biases

1 code implementation NeurIPS 2021 Shashi Kant Gupta, Mengmi Zhang, Chia-Chien Wu, Jeremy M. Wolfe, Gabriel Kreiman

To elucidate the mechanisms responsible for asymmetry in visual search, we propose a computational model that takes a target and a search image as inputs and produces a sequence of eye movements until the target is found.
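
The loop below is a hedged sketch of the generic fixation-by-fixation procedure such a model implies: fixate the peak of an attention map, suppress it (inhibition of return), and stop when the target location is reached. The attention map here is random noise, a stand-in for the model's output.

# Hedged sketch of a fixation loop like the one described above: pick the peak
# of an attention map, suppress it (inhibition of return), and stop when the
# target location is fixated. The attention map is a stand-in for model output.
import numpy as np

def search_scanpath(attention_map, target_xy, max_fixations=50, ior_radius=2):
    attn = attention_map.astype(float).copy()
    h, w = attn.shape
    fixations = []
    for _ in range(max_fixations):
        y, x = np.unravel_index(np.argmax(attn), attn.shape)
        fixations.append((x, y))
        if (x, y) == target_xy:                       # target found
            break
        y0, y1 = max(0, y - ior_radius), min(h, y + ior_radius + 1)
        x0, x1 = max(0, x - ior_radius), min(w, x + ior_radius + 1)
        attn[y0:y1, x0:x1] = -np.inf                  # inhibition of return
    return fixations

rng = np.random.default_rng(0)
scanpath = search_scanpath(rng.random((20, 20)), target_xy=(7, 12))
print(len(scanpath), "fixations")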

When Pigs Fly: Contextual Reasoning in Synthetic and Natural Scenes

1 code implementation ICCV 2021 Philipp Bomatter, Mengmi Zhang, Dimitar Karev, Spandan Madan, Claire Tseng, Gabriel Kreiman

Our model captures useful information for contextual reasoning, enabling human-level performance and better robustness in out-of-context conditions compared to baseline models across OCD and other out-of-context datasets.

Object

Tuned Compositional Feature Replays for Efficient Stream Learning

1 code implementation 6 Apr 2021 Morgan B. Talbot, Rushikesh Zawar, Rohil Badkundri, Mengmi Zhang, Gabriel Kreiman

To address the limited number of existing online stream learning datasets, we introduce 2 new benchmarks by adapting existing datasets for stream learning.

Continual Learning · Image Classification +2

What am I Searching for: Zero-shot Target Identity Inference in Visual Search

1 code implementation 25 May 2020 Mengmi Zhang, Gabriel Kreiman

Using those error fixations, we developed a model (InferNet) to infer what the target was.
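
As a rough, hedged illustration of the general idea (not the InferNet architecture), the sketch below pools image features at the fixated locations and compares the pooled descriptor against candidate target embeddings; the feature extractor and pooling choice are placeholders.

# Hedged sketch of inferring a search target from error fixations: average the
# feature vectors at fixated locations and pick the closest candidate target.
# This is a generic stand-in, not the InferNet architecture.
import torch
import torch.nn.functional as F

def infer_target(feature_map, fixations, candidate_embeddings):
    """feature_map: (D, H, W); fixations: list of (x, y); candidates: (C, D)."""
    fixated = torch.stack([feature_map[:, y, x] for x, y in fixations])  # (F, D)
    descriptor = F.normalize(fixated.mean(dim=0, keepdim=True), dim=-1)  # (1, D)
    scores = descriptor @ F.normalize(candidate_embeddings, dim=-1).t()  # (1, C)
    return scores.argmax(dim=1).item()

feature_map = torch.randn(128, 16, 16)       # placeholder CNN/ViT features
candidates = torch.randn(10, 128)            # placeholder target embeddings
print(infer_target(feature_map, [(3, 4), (10, 2)], candidates))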

Putting visual object recognition in context

1 code implementation CVPR 2020 Mengmi Zhang, Claire Tseng, Gabriel Kreiman

To model the role of contextual information in visual recognition, we systematically investigated ten critical properties of where, when, and how context modulates recognition, including the amount of context, context and object resolution, geometrical structure of context, context congruence, and temporal dynamics of contextual modulation.

Object · Object Recognition

Prototype Recalls for Continual Learning

no code implementations 25 Sep 2019 Mengmi Zhang, Tao Wang, Joo Hwee Lim, Jiashi Feng

Without compromising performance on initial tasks, our method learns novel concepts given a few training examples of each class in new tasks.

Continual Learning · Metric Learning +1

Variational Prototype Replays for Continual Learning

1 code implementation 23 May 2019 Mengmi Zhang, Tao Wang, Joo Hwee Lim, Gabriel Kreiman, Jiashi Feng

In each classification task, our method learns a set of variational prototypes with their means and variances, where embeddings of samples from the same class are represented by a prototypical distribution and class-representative prototypes are kept well separated.

Continual Learning · General Classification +2
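
The sketch below is a hedged illustration of the variational-prototype idea: one Gaussian (mean, log-variance) per class in embedding space, with replay performed by sampling pseudo-embeddings via the reparameterization trick. Dimensions and the classification rule are illustrative, not the paper's exact objective.

# Hedged sketch of variational class prototypes: one Gaussian (mean, log-var)
# per class in embedding space, with replay by sampling through the
# reparameterization trick. Loss terms and dimensions are illustrative only.
import torch
import torch.nn as nn

class VariationalPrototypes(nn.Module):
    def __init__(self, num_classes, dim):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(num_classes, dim) * 0.01)
        self.logvar = nn.Parameter(torch.zeros(num_classes, dim))

    def classify(self, embeddings):
        # Nearest-prototype logits: negative squared distance to each mean.
        d2 = torch.cdist(embeddings, self.mu) ** 2
        return -d2

    def replay(self, class_ids, n_per_class=8):
        # Sample pseudo-embeddings for old classes instead of storing raw data.
        mu = self.mu[class_ids].repeat_interleave(n_per_class, dim=0)
        std = (0.5 * self.logvar[class_ids]).exp().repeat_interleave(n_per_class, dim=0)
        samples = mu + std * torch.randn_like(std)
        labels = class_ids.repeat_interleave(n_per_class)
        return samples, labels

protos = VariationalPrototypes(num_classes=10, dim=64)
x, y = protos.replay(torch.tensor([0, 1, 2]))
print(x.shape, y.shape)   # torch.Size([24, 64]) torch.Size([24])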

Lift-the-flap: what, where and when for context reasoning

no code implementations 1 Feb 2019 Mengmi Zhang, Claire Tseng, Karla Montejo, Joseph Kwon, Gabriel Kreiman

Context reasoning is critical in a wide variety of applications where current inputs need to be interpreted in the light of previous experience and knowledge.

General Classification · Object Recognition +1

What am I Searching for: Zero-shot Target Identity Inference in Visual Search

1 code implementation 31 Jul 2018 Mengmi Zhang, Gabriel Kreiman

Using those error fixations, we developed a model (InferNet) to infer what the target was.

Egocentric Spatial Memory

1 code implementation 31 Jul 2018 Mengmi Zhang, Keng Teck Ma, Shih-Cheng Yen, Joo Hwee Lim, Qi Zhao, Jiashi Feng

Egocentric spatial memory (ESM) defines a memory system that encodes, stores, recognizes, and recalls spatial information about the environment from an egocentric perspective.

Feature Engineering

Egocentric Spatial Memory Network

no code implementations ICLR 2018 Mengmi Zhang, Keng Teck Ma, Joo Hwee Lim, Shih-Cheng Yen, Qi Zhao, Jiashi Feng

During exploration, our proposed ESM network model updates its belief about the global map based on local observations using a recurrent neural network.

Navigate · Simultaneous Localization and Mapping
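
A hedged sketch of recurrent map updating follows: a GRU cell fuses each local observation embedding into a hidden belief, which is then decoded into a 2-D map grid. Layer sizes and the decoder are assumptions for illustration, not the ESM network itself.

# Hedged sketch of recurrent global-map updating: a GRU fuses each local
# observation into a hidden belief, which is then decoded into a 2-D map grid.
# Layer sizes and the decoder are placeholders, not the ESM network itself.
import torch
import torch.nn as nn

class RecurrentMapBelief(nn.Module):
    def __init__(self, obs_dim=256, belief_dim=512, map_hw=32):
        super().__init__()
        self.rnn = nn.GRUCell(obs_dim, belief_dim)
        self.decode = nn.Linear(belief_dim, map_hw * map_hw)
        self.map_hw = map_hw

    def forward(self, obs_seq):
        """obs_seq: (T, B, obs_dim) local observation embeddings."""
        belief = torch.zeros(obs_seq.size(1), self.rnn.hidden_size)
        maps = []
        for obs in obs_seq:                      # integrate one step at a time
            belief = self.rnn(obs, belief)       # update belief of global map
            maps.append(self.decode(belief).view(-1, self.map_hw, self.map_hw))
        return torch.stack(maps)                 # (T, B, H, W) map estimates

model = RecurrentMapBelief()
print(model(torch.randn(5, 2, 256)).shape)       # torch.Size([5, 2, 32, 32])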

Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks

1 code implementation CVPR 2017 Mengmi Zhang, Keng Teck Ma, Joo Hwee Lim, Qi Zhao, Jiashi Feng

Through competition with the discriminator, the generator progressively improves the quality of the future frames and thus anticipates future gaze better.

Gaze Prediction
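
The sketch below shows the generic adversarial loop implied by the abstract: a generator maps current frames to future frames, and a discriminator scores real versus generated futures (gaze would then be read out from the generated frames). Both networks here are tiny stubs used only to make the loop concrete; this is not the DFG architecture.

# Hedged sketch of the adversarial future-frame setup described above: a
# generator maps current frames to future frames and a discriminator scores
# real vs. generated futures. Both networks are tiny stubs, not the DFG model.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(16, 3, 3, padding=1))            # frame -> future frame
D = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                  nn.Linear(16, 1))                           # future frame -> score
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(current, future):
    fake = G(current)
    # Discriminator: real futures vs. generated futures.
    d_loss = bce(D(future), torch.ones(future.size(0), 1)) + \
             bce(D(fake.detach()), torch.zeros(future.size(0), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: fool the discriminator (plus, in practice, a reconstruction term).
    g_loss = bce(D(fake), torch.ones(future.size(0), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

print(train_step(torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)))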
