Search Results for author: Guangzhi Wang

Found 13 papers, 9 papers with code

The Efficiency Spectrum of Large Language Models: An Algorithmic Survey

1 code implementation · 1 Dec 2023 · Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, Luming Liang

The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains, reshaping the artificial general intelligence landscape.

Model Compression

SEED-Bench-2: Benchmarking Multimodal Large Language Models

1 code implementation · 28 Nov 2023 · Bohao Li, Yuying Ge, Yixiao Ge, Guangzhi Wang, Rui Wang, Ruimao Zhang, Ying Shan

Multimodal large language models (MLLMs), building upon the foundation of powerful large language models (LLMs), have recently demonstrated exceptional capabilities in generating not only texts but also images given interleaved multimodal inputs (acting like a combination of GPT-4V and DALL-E 3).

Benchmarking · Image Generation +1

PELA: Learning Parameter-Efficient Models with Low-Rank Approximation

1 code implementation · 16 Oct 2023 · Yangyang Guo, Guangzhi Wang, Mohan Kankanhalli

This allows for direct and efficient utilization of the low-rank model for downstream fine-tuning tasks.

SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension

2 code implementations · 30 Jul 2023 · Bohao Li, Rui Wang, Guangzhi Wang, Yuying Ge, Yixiao Ge, Ying Shan

Based on powerful Large Language Models (LLMs), recent generative Multimodal Large Language Models (MLLMs) have gained prominence as a pivotal research area, exhibiting remarkable capability for both comprehension and generation.

Benchmarking · Multiple-choice

Mining Conditional Part Semantics with Occluded Extrapolation for Human-Object Interaction Detection

no code implementations · 19 Jul 2023 · Guangzhi Wang, Yangyang Guo, Mohan Kankanhalli

Human-Object Interaction Detection is a crucial aspect of human-centric scene understanding, with important applications in various domains.

Human-Object Interaction Detection · Object +1

EVD Surgical Guidance with Retro-Reflective Tool Tracking and Spatial Reconstruction using Head-Mounted Augmented Reality Device

no code implementations · 27 Jun 2023 · Haowei Li, Wenqing Yan, Du Liu, Long Qian, Yuxing Yang, Yihao Liu, Zhe Zhao, Hui Ding, Guangzhi Wang

The head surface is reconstructed using depth data for spatial registration, avoiding fixing tracking targets rigidly on the patient's skull.

Anatomy

What Makes for Good Visual Tokenizers for Large Language Models?

1 code implementation · 20 May 2023 · Guangzhi Wang, Yixiao Ge, Xiaohan Ding, Mohan Kankanhalli, Ying Shan

In our benchmark, which is curated to evaluate MLLMs' visual semantic understanding and fine-grained perception capabilities, we discuss different visual tokenizers pre-trained with dominant methods (i.e., DeiT, CLIP, MAE, DINO) and observe that: i) fully/weakly supervised models capture more semantics than self-supervised models, but the gap narrows as the pre-training dataset is scaled up.

Image Captioning · Object Counting +2

Text to Point Cloud Localization with Relation-Enhanced Transformer

no code implementations · 13 Jan 2023 · Guangzhi Wang, Hehe Fan, Mohan Kankanhalli

To overcome these two challenges, we propose a unified Relation-Enhanced Transformer (RET) to improve representation discriminability for both point cloud and natural language queries.

Natural Language Queries · Relation

Distance Matters in Human-Object Interaction Detection

1 code implementation · 5 Jul 2022 · Guangzhi Wang, Yangyang Guo, Yongkang Wong, Mohan Kankanhalli

2) Insufficient number of distant interactions in benchmark datasets results in under-fitting on these instances.

Human-Object Interaction Detection · Object +1

Relation-aware Compositional Zero-shot Learning for Attribute-Object Pair Recognition

1 code implementation · 10 Aug 2021 · Ziwei Xu, Guangzhi Wang, Yongkang Wong, Mohan Kankanhalli

The concept module generates semantically meaningful features for primitive concepts, whereas the visual module extracts visual features for attributes and objects from input images.

Attribute · Blocking +2

Multi-source Distilling Domain Adaptation

1 code implementation · 22 Nov 2019 · Sicheng Zhao, Guangzhi Wang, Shanghang Zhang, Yang Gu, Yaxian Li, Zhichao Song, Pengfei Xu, Runbo Hu, Hua Chai, Kurt Keutzer

Deep neural networks suffer from performance decay when there is domain shift between the labeled source domain and unlabeled target domain, which motivates the research on domain adaptation (DA).

Domain Adaptation · Multi-Source Unsupervised Domain Adaptation
