Search Results for author: Cong Wei

Found 8 papers, 3 papers with code

LaSagnA: Language-based Segmentation Assistant for Complex Queries

2 code implementations • 12 Apr 2024 • Cong Wei, Haoxian Tan, Yujie Zhong, Yujiu Yang, Lin Ma

Recent advancements have empowered Large Language Models for Vision (vLLMs) to generate detailed perceptual outcomes, including bounding boxes and masks.

Segmentation, Semantic Segmentation

AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks

no code implementations • 21 Mar 2024 • Max Ku, Cong Wei, Weiming Ren, Harry Yang, Wenhu Chen

In the second stage, AnyV2V can plug in any existing image-to-video models to perform DDIM inversion and intermediate feature injection to maintain the appearance and motion consistency with the source video.

Image to Video Generation, Style Transfer, +1

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation

no code implementations • 6 Feb 2024 • Weiming Ren, Harry Yang, Ge Zhang, Cong Wei, Xinrun Du, Stephen Huang, Wenhu Chen

To verify the effectiveness of our method, we propose I2V-Bench, a comprehensive evaluation benchmark for I2V generation.

Image to Video Generation

VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation

no code implementations • 22 Dec 2023 • Max Ku, Dongfu Jiang, Cong Wei, Xiang Yue, Wenhu Chen

We evaluate VIESCORE on seven prominent conditional image tasks and found: (1) VIESCORE (GPT4-v) achieves a high Spearman correlation of 0.3 with human evaluations, while the human-to-human correlation is 0.45.

Conditional Image Generation, General Knowledge
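
The Spearman correlation quoted in the VIEScore summary measures how well one set of scores preserves the ranking of another. The following is a minimal sketch of that statistic in plain Python, not the paper's evaluation code; the function names `average_ranks` and `spearman` and the sample scores are hypothetical and chosen only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: automatic metric scores and human ratings for five images.
metric_scores = [0.2, 0.5, 0.9, 0.4, 0.7]
human_ratings = [1, 3, 4, 2, 5]
rho = spearman(metric_scores, human_ratings)
```

A value near 1 means the metric orders the images almost exactly as the human raters do, which is why the summary compares the metric-to-human figure against the human-to-human one.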

UniIR: Training and Benchmarking Universal Multimodal Information Retrievers

no code implementations • 28 Nov 2023 • Cong Wei, Yang Chen, Haonan Chen, Hexiang Hu, Ge Zhang, Jie Fu, Alan Ritter, Wenhu Chen

Existing information retrieval (IR) models often assume a homogeneous format, limiting their applicability to diverse user needs, such as searching for images with text descriptions, searching for a news article with a headline image, or finding a similar photo with a query image.

Benchmarking, Information Retrieval, +2

DreamEdit: Subject-driven Image Editing

no code implementations • 22 Jun 2023 • Tianle Li, Max Ku, Cong Wei, Wenhu Chen

In this work, we aspire to fill the void and propose two novel subject-driven sub-tasks, i.e., Subject Replacement and Subject Addition.

Image Generation, Position

Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers

1 code implementation • CVPR 2023 • Cong Wei, Brendan Duke, Ruowei Jiang, Parham Aarabi, Graham W. Taylor, Florian Shkurti

Equipped with the learned unstructured attention pattern, sparse attention ViT (Sparsifiner) produces a superior Pareto-optimal trade-off between FLOPs and top-1 accuracy on ImageNet compared to token sparsity.
