Search Results for author: Feifei Feng

Found 12 papers, 2 papers with code

Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models

1 code implementation • 10 Mar 2024 • Minjie Zhu, Yichen Zhu, Xin Liu, Ning Liu, Zhiyuan Xu, Chaomin Shen, Yaxin Peng, Zhicai Ou, Feifei Feng, Jian Tang

Multimodal Large Language Models (MLLMs) have showcased impressive skills in tasks related to visual understanding and reasoning.

Ranked #69 on Visual Question Answering on MM-Vet

Visual Question Answering

Language-Conditioned Robotic Manipulation with Fast and Slow Thinking

no code implementations • 8 Jan 2024 • Minjie Zhu, Yichen Zhu, Jinming Li, Junjie Wen, Zhiyuan Xu, Zhengping Che, Chaomin Shen, Yaxin Peng, Dong Liu, Feifei Feng, Jian Tang

Language-conditioned robotic manipulation aims to translate natural language instructions into executable actions, from simple pick-and-place to tasks requiring intent recognition and visual reasoning.

Decision Making, Intent Recognition

Object-Centric Instruction Augmentation for Robotic Manipulation

no code implementations • 5 Jan 2024 • Junjie Wen, Yichen Zhu, Minjie Zhu, Jinming Li, Zhiyuan Xu, Zhengping Che, Chaomin Shen, Yaxin Peng, Dong Liu, Feifei Feng, Jian Tang

Humans interpret scenes by recognizing both the identities and positions of objects in their observations.

Language Modelling, Large Language Model

DTF-Net: Category-Level Pose Estimation and Shape Reconstruction via Deformable Template Field

no code implementations • 4 Aug 2023 • Haowen Wang, Zhipeng Fan, Zhen Zhao, Zhengping Che, Zhiyuan Xu, Dong Liu, Feifei Feng, Yakun Huang, Xiuquan Qiao, Jian Tang

We introduce a pose regression module that shares the deformation features and template codes from the fields to estimate the accurate 6D pose of each object in the scene.

Object Pose Estimation

RDFC-GAN: RGB-Depth Fusion CycleGAN for Indoor Depth Completion

no code implementations • 6 Jun 2023 • Haowen Wang, Zhengping Che, Yufan Yang, Mingyuan Wang, Zhiyuan Xu, Xiuquan Qiao, Mengshi Qi, Feifei Feng, Jian Tang

Raw depth images captured in indoor scenarios frequently exhibit extensive missing values due to the inherent limitations of the sensors and environments.

Depth Completion, Transparent Objects

CMG-Net: An End-to-End Contact-Based Multi-Finger Dexterous Grasping Network

no code implementations • 23 Mar 2023 • Mingze Wei, Yaomin Huang, Zhiyuan Xu, Ning Liu, Zhengping Che, Xinyu Zhang, Chaomin Shen, Feifei Feng, Chun Shan, Jian Tang

Our work significantly outperforms the state-of-the-art for three-finger robotic hands.

CP$^3$: Channel Pruning Plug-in for Point-based Networks

no code implementations • 23 Mar 2023 • Yaomin Huang, Ning Liu, Zhengping Che, Zhiyuan Xu, Chaomin Shen, Yaxin Peng, Guixu Zhang, Xinmei Liu, Feifei Feng, Jian Tang

CP$^3$ is elaborately designed to leverage the characteristics of point clouds and PNNs in order to enable 2D channel pruning methods for PNNs.

CP3: Channel Pruning Plug-In for Point-Based Networks

no code implementations • CVPR 2023 • Yaomin Huang, Ning Liu, Zhengping Che, Zhiyuan Xu, Chaomin Shen, Yaxin Peng, Guixu Zhang, Xinmei Liu, Feifei Feng, Jian Tang

Directly applying 2D CNN channel pruning methods to PNNs undermines the performance of PNNs because of the different representations of 2D images and 3D point clouds as well as the disparity in network architecture.

Label-Guided Auxiliary Training Improves 3D Object Detector

1 code implementation • 24 Jul 2022 • Yaomin Huang, Xinmei Liu, Yichen Zhu, Zhiyuan Xu, Chaomin Shen, Zhengping Che, Guixu Zhang, Yaxin Peng, Feifei Feng, Jian Tang

Detecting 3D objects from point clouds is a practical yet challenging task that has attracted increasing attention recently.

3D Object Detection, Object

RGB-Depth Fusion GAN for Indoor Depth Completion

no code implementations • CVPR 2022 • Haowen Wang, Mingyuan Wang, Zhengping Che, Zhiyuan Xu, Xiuquan Qiao, Mengshi Qi, Feifei Feng, Jian Tang

In this paper, we design a novel two-branch end-to-end fusion network, which takes a pair of RGB and incomplete depth images as input to predict a dense and completed depth map.

Depth Completion, Transparent Objects

Make A Long Image Short: Adaptive Token Length for Vision Transformers

no code implementations • 3 Dec 2021 • Yichen Zhu, Yuqin Zhu, Jie Du, Yi Wang, Zhicai Ou, Feifei Feng, Jian Tang

The TLA enables the ReViT to process the image with the minimum sufficient number of tokens during inference.

Action Recognition, Image Classification

Training BatchNorm Only in Neural Architecture Search and Beyond

no code implementations • 1 Dec 2021 • Yichen Zhu, Jie Du, Yuqin Zhu, Yi Wang, Zhicai Ou, Feifei Feng, Jian Tang

Critically, there is no effort to understand 1) why training BatchNorm only can find well-performing architectures with reduced supernet-training time, and 2) what the difference is between the train-BN-only supernet and the standard-train supernet.

Fairness, Neural Architecture Search
