no code implementations • 15 Apr 2024 • Yiming Zhang, Zhuokai Zhao, Zhaorun Chen, Zhili Feng, Zenghui Ding, Yining Sun
Among the ever-evolving development of vision-language models, contrastive language-image pretraining (CLIP) has set new benchmarks in many downstream tasks, such as zero-shot classification, by leveraging self-supervised contrastive learning on large amounts of text-image pairs.
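The contrastive objective mentioned above can be illustrated with a minimal sketch of a symmetric InfoNCE loss over a batch of paired image/text embeddings. This is a generic illustration of the CLIP-style training signal, not this paper's method; the function names and the temperature value are assumptions for the example.

```python
import numpy as np

def softmax_xent(logits):
    # Stable log-softmax; the targets are the diagonal entries,
    # i.e. each image's matched caption (and vice versa).
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def clip_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive loss: matched image-text pairs (row i of
    each matrix) are positives; all other in-batch pairings are negatives."""
    # L2-normalize so dot products become cosine similarities
    img_emb = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt_emb = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    # Pairwise similarity matrix, scaled by temperature
    logits = img_emb @ txt_emb.T / temperature
    # Average the image->text and text->image cross-entropies
    return 0.5 * (softmax_xent(logits) + softmax_xent(logits.T))
```

With perfectly aligned, orthonormal embeddings (e.g. identical identity matrices for both modalities) the loss is near zero, since each pair is already far more similar to its match than to any in-batch negative.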
1 code implementation • 1 Mar 2024 • Zhaorun Chen, Zhuokai Zhao, Hongyin Luo, Huaxiu Yao, Bo Li, Jiawei Zhou
While large vision-language models (LVLMs) have demonstrated impressive capabilities in interpreting multi-modal contexts, they invariably suffer from object hallucinations (OH).
no code implementations • 18 Feb 2024 • Zhaorun Chen, Zhuokai Zhao, Zhihong Zhu, Ruiqi Zhang, Xiang Li, Bhiksha Raj, Huaxiu Yao
Recent advancements in large language models (LLMs) have shown promise in multi-step reasoning tasks, yet their reliance on extensive manual labeling to provide procedural feedback remains a significant impediment.
no code implementations • 8 Feb 2024 • Zhuokai Zhao, Yibo Jiang, Yuxin Chen
Active Learning (AL) has gained prominence in integrating data-intensive machine learning (ML) models into domains with limited labeled data.
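Active learning's core loop, selecting which unlabeled examples to send for annotation, can be sketched with a standard uncertainty-sampling heuristic. This is a common baseline acquisition function, not necessarily the strategy studied in this paper; the function name is hypothetical.

```python
import numpy as np

def uncertainty_sample(probs, k):
    """Select the k unlabeled examples whose predicted class
    distributions have the highest entropy (least confident first).

    probs: (n_examples, n_classes) array of model probabilities.
    Returns indices of the k examples to label next.
    """
    # Predictive entropy per example; epsilon guards against log(0)
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    # Highest-entropy (most uncertain) examples first
    return np.argsort(-entropy)[:k]
```

In a labeling loop, the model is retrained after each batch of newly acquired labels, so annotation effort concentrates on the examples the current model finds most ambiguous.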
no code implementations • 7 Sep 2023 • Zhuokai Zhao, Harish Palani, Tianyi Liu, Lena Evans, Ruth Toner
Multimodal models have achieved significant success in recent years.
no code implementations • 31 May 2023 • Zhuokai Zhao, Takumi Matsuzawa, William Irvine, Michael Maire, Gordon L Kindlmann
NERO evaluation consists of a task-agnostic interactive interface and a set of visualizations, called NERO plots, which reveal the equivariance properties of the model.
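The equivariance property that NERO plots visualize can be illustrated with a minimal check of whether a model commutes with a transform. This is a simplified illustration of the general idea, not the paper's implementation; the helper name and the example model are assumptions.

```python
import numpy as np

def equivariance_error(model, transform, x):
    """How far `model` is from commuting with `transform`:
    zero means the model is exactly equivariant to that transform."""
    return np.max(np.abs(model(transform(x)) - transform(model(x))))

# Example: a circular mean filter is equivariant to circular shifts,
# because averaging neighbors commutes with rotating the signal.
blur = lambda x: (x + np.roll(x, 1) + np.roll(x, -1)) / 3.0
shift = lambda x: np.roll(x, 2)
x = np.random.rand(16)
```

Evaluating this error across a whole family of transforms (all shifts, all rotation angles, etc.) rather than a single one is what turns a pointwise check into an equivariance visualization.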
no code implementations • 24 May 2023 • Zhuokai Zhao, Yang Yang, Wenyu Wang, Chihuang Liu, Yu Shi, Wenjie Hu, Haotian Zhang, Shuang Yang
A key puzzle in search, ads, and recommendation is that the ranking model can only utilize a small portion of the vastly available user interaction data.