Search Results for author: Jinliang Zheng

Found 3 papers, 1 paper with code

GLID: Pre-training a Generalist Encoder-Decoder Vision Model

no code implementations • 11 Apr 2024 • Jihao Liu, Jinliang Zheng, Yu Liu, Hongsheng Li

This paper proposes a GeneraLIst encoder-Decoder (GLID) pre-training method that better handles a variety of downstream computer vision tasks.

Depth Estimation • Image Segmentation • +5

DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning

no code implementations • 28 Feb 2024 • Jianxiong Li, Jinliang Zheng, Yinan Zheng, Liyuan Mao, Xiao Hu, Sijie Cheng, Haoyi Niu, Jihao Liu, Yu Liu, Jingjing Liu, Ya-Qin Zhang, Xianyuan Zhan

Multimodal pretraining has emerged as an effective strategy for the trinity of goals of representation learning in autonomous robots: 1) extracting both local and global task progression information; 2) enforcing temporal consistency of visual representation; 3) capturing trajectory-level language grounding.

Contrastive Learning • Decision Making • +1
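The trajectory-level language grounding described in the abstract is the kind of objective usually trained contrastively. The sketch below is an illustrative InfoNCE-style loss in NumPy, not the paper's actual DecisionNCE objective; all names (`info_nce`, `video_emb`, `text_emb`, the temperature value) are assumptions for illustration only.

```python
import numpy as np

def info_nce(video_emb, text_emb, temperature=0.1):
    """Illustrative InfoNCE loss: row i of video_emb is the positive
    pair of row i of text_emb; all other rows serve as negatives."""
    # L2-normalize both sets of embeddings
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = v @ t.T / temperature  # (B, B) cosine-similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # diagonal entries are the matched (positive) pairs
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
v = rng.normal(size=(4, 8))
loss = info_nce(v, v)                            # aligned pairs -> low loss
loss_rand = info_nce(v, rng.normal(size=(4, 8))) # unrelated text -> higher loss
```

Aligned video/text pairs sit on the diagonal of the similarity matrix, so pulling them together while pushing off-diagonal pairs apart is what enforces the grounding; the actual DecisionNCE formulation derives its objective from implicit preference learning rather than this plain form.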

MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers

1 code implementation • CVPR 2023 • Jihao Liu, Xin Huang, Jinliang Zheng, Yu Liu, Hongsheng Li

In this paper, we propose Mixed and Masked AutoEncoder (MixMAE), a simple but efficient pretraining method that is applicable to various hierarchical Vision Transformers.

Image Classification • Object Detection • +2
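The core "mixed and masked" idea the abstract names — building one training input out of the tokens of two images under a random mask — can be sketched in a few lines of NumPy. This is a toy illustration under assumed names (`img_a`, `img_b`, `mask`), not the released MixMAE code.

```python
import numpy as np

rng = np.random.default_rng(0)
num_patches, dim = 16, 4
img_a = rng.normal(size=(num_patches, dim))  # patch tokens of image A
img_b = rng.normal(size=(num_patches, dim))  # patch tokens of image B

# random binary mask: True -> take A's token, False -> take B's token
mask = rng.random(num_patches) < 0.5
mixed = np.where(mask[:, None], img_a, img_b)  # the mixed training input
```

The encoder sees only `mixed`, and the decoder is trained to reconstruct image A at the positions filled from B and image B at the positions filled from A, so no patch position is wasted on a contentless [MASK] placeholder.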
