Search Results for author: zhihuan yu

Found 2 papers, 1 paper with code

EventLens: Leveraging Event-Aware Pretraining and Cross-modal Linking Enhances Visual Commonsense Reasoning

no code implementations • 22 Apr 2024 • Mingjie Ma, Zhihuan Yu, Yichao Ma, GuoHui Li

First, by emulating the cognitive process of human reasoning, an Event-Aware Pretraining auxiliary task is introduced to better activate the LLM's global comprehension of intricate scenarios.

Visual Commonsense Reasoning


LMD: Faster Image Reconstruction with Latent Masking Diffusion

1 code implementation • 13 Dec 2023 • Zhiyuan Ma, Zhihuan Yu, Jianjun Li, BoWen Zhou

Then, we combine the advantages of MAEs and DPMs to design a progressive masking diffusion model that gradually increases the masking proportion using three different schedulers and reconstructs the latent features from simple to difficult. Unlike DPMs, it does not perform denoising diffusion sequentially, and unlike MAEs, it does not use a fixed high masking ratio, which alleviates the high training-time cost.

Denoising Image Reconstruction

