Search Results for author: Guangzhi Wang

Found 13 papers, 9 papers with code

The Efficiency Spectrum of Large Language Models: An Algorithmic Survey

1 code implementation · 1 Dec 2023 · Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, Luming Liang

The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains, reshaping the artificial general intelligence landscape.

Model Compression

SEED-Bench-2: Benchmarking Multimodal Large Language Models

1 code implementation · 28 Nov 2023 · Bohao Li, Yuying Ge, Yixiao Ge, Guangzhi Wang, Rui Wang, Ruimao Zhang, Ying Shan

Multimodal large language models (MLLMs), building upon the foundation of powerful large language models (LLMs), have recently demonstrated exceptional capabilities in generating not only texts but also images given interleaved multimodal inputs (acting like a combination of GPT-4V and DALL-E 3).

Benchmarking · Image Generation +1

PELA: Learning Parameter-Efficient Models with Low-Rank Approximation

1 code implementation · 16 Oct 2023 · Yangyang Guo, Guangzhi Wang, Mohan Kankanhalli

This allows for direct and efficient utilization of the low-rank model for downstream fine-tuning tasks.

SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension

2 code implementations · 30 Jul 2023 · Bohao Li, Rui Wang, Guangzhi Wang, Yuying Ge, Yixiao Ge, Ying Shan

Based on powerful Large Language Models (LLMs), recent generative Multimodal Large Language Models (MLLMs) have gained prominence as a pivotal research area, exhibiting remarkable capability for both comprehension and generation.

Benchmarking · Multiple-choice

Mining Conditional Part Semantics with Occluded Extrapolation for Human-Object Interaction Detection

no code implementations · 19 Jul 2023 · Guangzhi Wang, Yangyang Guo, Mohan Kankanhalli

Human-Object Interaction Detection is a crucial aspect of human-centric scene understanding, with important applications in various domains.

Human-Object Interaction Detection · Object +1

EVD Surgical Guidance with Retro-Reflective Tool Tracking and Spatial Reconstruction using Head-Mounted Augmented Reality Device

no code implementations · 27 Jun 2023 · Haowei Li, Wenqing Yan, Du Liu, Long Qian, Yuxing Yang, Yihao Liu, Zhe Zhao, Hui Ding, Guangzhi Wang

The head surface is reconstructed using depth data for spatial registration, avoiding fixing tracking targets rigidly on the patient's skull.

Anatomy

What Makes for Good Visual Tokenizers for Large Language Models?

1 code implementation · 20 May 2023 · Guangzhi Wang, Yixiao Ge, Xiaohan Ding, Mohan Kankanhalli, Ying Shan

In our benchmark, which is curated to evaluate MLLMs' visual semantic understanding and fine-grained perception capabilities, we discuss different visual tokenizers pre-trained with dominant methods (i.e., DeiT, CLIP, MAE, DINO) and observe that: i) fully/weakly supervised models capture more semantics than self-supervised models, but the gap narrows as the pre-training dataset is scaled up.

Image Captioning · Object Counting +2

Text to Point Cloud Localization with Relation-Enhanced Transformer

no code implementations · 13 Jan 2023 · Guangzhi Wang, Hehe Fan, Mohan Kankanhalli

To overcome these two challenges, we propose a unified Relation-Enhanced Transformer (RET) to improve representation discriminability for both point cloud and natural language queries.

Natural Language Queries · Relation

Distance Matters in Human-Object Interaction Detection

1 code implementation · 5 Jul 2022 · Guangzhi Wang, Yangyang Guo, Yongkang Wong, Mohan Kankanhalli

2) Insufficient number of distant interactions in benchmark datasets results in under-fitting on these instances.

Human-Object Interaction Detection · Object +1

Relation-aware Compositional Zero-shot Learning for Attribute-Object Pair Recognition

1 code implementation · 10 Aug 2021 · Ziwei Xu, Guangzhi Wang, Yongkang Wong, Mohan Kankanhalli

The concept module generates semantically meaningful features for primitive concepts, whereas the visual module extracts visual features for attributes and objects from input images.

Attribute · Blocking +2

Multi-source Distilling Domain Adaptation

1 code implementation · 22 Nov 2019 · Sicheng Zhao, Guangzhi Wang, Shanghang Zhang, Yang Gu, Yaxian Li, Zhichao Song, Pengfei Xu, Runbo Hu, Hua Chai, Kurt Keutzer

Deep neural networks suffer from performance decay when there is domain shift between the labeled source domain and unlabeled target domain, which motivates the research on domain adaptation (DA).

Domain Adaptation · Multi-Source Unsupervised Domain Adaptation
