Search Results for author: Zanlin Ni

Found 7 papers, 6 papers with code

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

2 code implementations • 18 Mar 2024 • Ruyi Xu, Yuan Yao, Zonghao Guo, Junbo Cui, Zanlin Ni, Chunjiang Ge, Tat-Seng Chua, Zhiyuan Liu, Maosong Sun, Gao Huang

To address the challenges, we present LLaVA-UHD, a large multimodal model that can efficiently perceive images in any aspect ratio and high resolution.

1,904


Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

1 code implementation • 7 Dec 2023 • Jiayi Guo, Xingqian Xu, Yifan Pu, Zanlin Ni, Chaofei Wang, Manushree Vasu, Shiji Song, Gao Huang, Humphrey Shi

Specifically, we introduce Step-wise Variation Regularization to enforce that the ratio between the variation of an arbitrary input latent and that of the output image is constant at any diffusion training step.

268


Deep Incubation: Training Large Models by Divide-and-Conquering

3 code implementations • ICCV 2023 • Zanlin Ni, Yulin Wang, Jiangwei Yu, Haojun Jiang, Yue Cao, Gao Huang

In this paper, we present Deep Incubation, a novel approach that enables the efficient and effective training of large models by dividing them into smaller sub-modules that can be trained separately and assembled seamlessly.

Image Segmentation, Object Detection, +2

255


Cross-Modal Adapter for Text-Video Retrieval

1 code implementation • 17 Nov 2022 • Haojun Jiang, Jianke Zhang, Rui Huang, Chunjiang Ge, Zanlin Ni, Jiwen Lu, Jie Zhou, Shiji Song, Gao Huang

However, as pre-trained models scale up, fully fine-tuning them on text-video retrieval datasets carries a high risk of overfitting.

Retrieval, Video Retrieval


Revisiting Locally Supervised Learning: an Alternative to End-to-end Training

1 code implementation • 26 Jan 2021 • Yulin Wang, Zanlin Ni, Shiji Song, Le Yang, Gao Huang

Due to the need to store the intermediate activations for back-propagation, end-to-end (E2E) training of deep networks usually suffers from a high GPU memory footprint.


Revisiting Locally Supervised Training of Deep Neural Networks

no code implementations • ICLR 2021 • Yulin Wang, Zanlin Ni, Shiji Song, Le Yang, Gao Huang

As the InfoPro loss is difficult to compute in its original form, we derive a feasible upper bound as a surrogate optimization objective, yielding a simple but effective algorithm.


Uncertainty-aware Score Distribution Learning for Action Quality Assessment

1 code implementation • CVPR 2020 • Yansong Tang, Zanlin Ni, Jiahuan Zhou, Danyang Zhang, Jiwen Lu, Ying Wu, Jie Zhou

Assessing action quality from videos has attracted growing attention in recent years.

Ranked #4 on Action Quality Assessment on AQA-7

Action Quality Assessment

