Search Results for author: Qijun Chen

Found 34 papers, 11 papers with code

Online, Target-Free LiDAR-Camera Extrinsic Calibration via Cross-Modal Mask Matching

no code implementations 28 Apr 2024 Zhiwei Huang, Yikang Zhang, Qijun Chen, Rui Fan

The cornerstone of our framework and toolbox is the cross-modal mask matching (C3M) algorithm, developed based on a state-of-the-art (SoTA) LVM and capable of generating sufficient and reliable matches.

Vision-and-Language Navigation via Causal Learning

1 code implementation 16 Apr 2024 Liuyi Wang, Zongtao He, Ronghao Dang, Mengjiao Shen, Chengju Liu, Qijun Chen

In the pursuit of robust and generalizable environment perception and language understanding, the ubiquitous challenge of dataset bias continues to plague vision-and-language navigation (VLN) agents, hindering their performance in unseen environments.

Causal Inference Contrastive Learning +1

Playing to Vision Foundation Model's Strengths in Stereo Matching

no code implementations 9 Apr 2024 Chuang-Wei Liu, Qijun Chen, Rui Fan

We believe this new paradigm will pave the way for the next generation of stereo matching networks.

Stereo Matching

HAPNet: Toward Superior RGB-Thermal Scene Parsing via Hybrid, Asymmetric, and Progressive Heterogeneous Feature Fusion

1 code implementation 4 Apr 2024 Jiahang Li, Peng Yun, Qijun Chen, Rui Fan

In this study, we take one step toward this new research area by exploring a feasible strategy to fully exploit VFM features for RGB-thermal scene parsing.

Scene Parsing Semantic Segmentation +1

LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving

no code implementations 13 Mar 2024 Sicen Guo, Zhiyuan Wu, Qijun Chen, Ioannis Pitas, Rui Fan

We introduce the Learning to Infuse "X" (LIX) framework, with novel contributions in both logit distillation and feature distillation aspects.

Autonomous Driving Knowledge Distillation +1

Causality-based Cross-Modal Representation Learning for Vision-and-Language Navigation

no code implementations 6 Mar 2024 Liuyi Wang, Zongtao He, Ronghao Dang, Huiyi Chen, Chengju Liu, Qijun Chen

Vision-and-Language Navigation (VLN) has gained significant research interest in recent years due to its potential applications in real-world scenarios.

Representation Learning Vision and Language Navigation

SNE-RoadSegV2: Advancing Heterogeneous Feature Fusion and Fallibility Awareness for Freespace Detection

no code implementations 29 Feb 2024 Yi Feng, Yu Ma, Qijun Chen, Ioannis Pitas, Rui Fan

Feature-fusion networks with duplex encoders have proven to be an effective technique to solve the freespace detection problem.

Computational Efficiency

CLIPose: Category-Level Object Pose Estimation with Pre-trained Vision-Language Knowledge

no code implementations 24 Feb 2024 Xiao Lin, Minghao Zhu, Ronghao Dang, Guangliang Zhou, Shaolong Shu, Feng Lin, Chengju Liu, Qijun Chen

Inspired by this motivation, we propose CLIPose, a novel 6D pose framework that employs the pre-trained vision-language model to develop better learning of object category information, which can fully leverage abundant semantic knowledge in image and text modalities.

Contrastive Learning Language Modelling +2

S$^3$M-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving

no code implementations 21 Jan 2024 Zhiyuan Wu, Yi Feng, Chuang-Wei Liu, Fisher Yu, Qijun Chen, Rui Fan

Hence, in this article, we introduce S$^3$M-Net, a novel joint learning framework developed to perform semantic segmentation and stereo matching simultaneously.

Autonomous Driving Scene Understanding +2

PICNN: A Pathway towards Interpretable Convolutional Neural Networks

1 code implementation 19 Dec 2023 Wengang Guo, Jiayi Yang, Huilin Yin, Qijun Chen, Wei Ye

Experimental results have demonstrated that our method PICNN (the combination of standard CNNs with our proposed pathway) exhibits greater interpretability than standard CNNs while achieving higher or comparable discrimination power.

Three-Filters-to-Normal+: Revisiting Discontinuity Discrimination in Depth-to-Normal Translation

no code implementations 13 Dec 2023 Jingwei Yang, Bohuan Xue, Yi Feng, Deming Wang, Rui Fan, Qijun Chen

This article introduces three-filters-to-normal+ (3F2N+), an extension of our previous work three-filters-to-normal (3F2N), with a specific focus on incorporating discontinuity discrimination capability into surface normal estimators (SNEs).

6D Pose Estimation using RGB Point Cloud Completion

TransPose: 6D Object Pose Estimation with Geometry-Aware Transformer

no code implementations 25 Oct 2023 Xiao Lin, Deming Wang, Guangliang Zhou, Chengju Liu, Qijun Chen

To improve robustness to occlusion, we adopt a Transformer to exchange global information, so that each local feature contains global information.

6D Pose Estimation using RGB Object

InstructDET: Diversifying Referring Object Detection with Generalized Instructions

1 code implementation 8 Oct 2023 Ronghao Dang, Jiangyan Feng, Haodong Zhang, Chongjian Ge, Lin Song, Lijun Gong, Chengju Liu, Qijun Chen, Feng Zhu, Rui Zhao, Yibing Song

To encompass common detection expressions, we use an emerging vision-language model (VLM) and a large language model (LLM) to generate instructions guided by text prompts and object bounding boxes, as the generalization ability of foundation models is effective for producing human-like expressions (e.g., describing object property, category, and relationship).

Language Modelling Large Language Model +4

Dive Deeper into Rectifying Homography for Stereo Camera Online Self-Calibration

no code implementations 19 Sep 2023 Hongbo Zhao, Yikang Zhang, Qijun Chen, Rui Fan

Instead, we introduce four new evaluation metrics to quantify the robustness and accuracy of extrinsic parameter estimation, applicable to both single-pair and multi-pair cases.

Stereo Matching Visual Odometry

RoadFormer: Duplex Transformer for RGB-Normal Semantic Road Scene Parsing

no code implementations 19 Sep 2023 Jiahang Li, Yikang Zhang, Peng Yun, Guangliang Zhou, Qijun Chen, Rui Fan

Additionally, we release SYN-UDTIRI, the first large-scale road scene parsing dataset that contains over 10,407 RGB images, dense depth images, and the corresponding pixel-level annotations for both freespace and road defects of different shapes and sizes.

Scene Parsing

Fine-Grained Spatiotemporal Motion Alignment for Contrastive Video Representation Learning

1 code implementation 1 Sep 2023 Minghao Zhu, Xiao Lin, Ronghao Dang, Chengju Liu, Qijun Chen

As the most essential property in a video, motion information is critical to a robust and generalized video representation.

Contrastive Learning Representation Learning

E3CM: Epipolar-Constrained Cascade Correspondence Matching

no code implementations 31 Aug 2023 Chenbo Zhou, Shuai Su, Qijun Chen, Rui Fan

Accurate and robust correspondence matching is of utmost importance for various 3D computer vision tasks.

Freespace Optical Flow Modeling for Automated Driving

no code implementations 29 Jul 2023 Yi Feng, Ruge Zhang, Jiayuan Du, Qijun Chen, Rui Fan

Additionally, our proposed freespace optical flow model boasts a diverse array of applications within the realm of automated driving, providing a geometric constraint in freespace detection, vehicle localization, and more.

Autonomous Driving Lane Detection +1

PASTS: Progress-Aware Spatio-Temporal Transformer Speaker For Vision-and-Language Navigation

no code implementations 19 May 2023 Liuyi Wang, Chengju Liu, Zongtao He, Shu Li, Qingqing Yan, Huiyi Chen, Qijun Chen

The experimental results demonstrate that PASTS outperforms all existing speaker models and successfully improves the performance of previous VLN models, achieving state-of-the-art performance on the standard Room-to-Room (R2R) dataset.

Data Augmentation Vision and Language Navigation

A Dual Semantic-Aware Recurrent Global-Adaptive Network For Vision-and-Language Navigation

1 code implementation 5 May 2023 Liuyi Wang, Zongtao He, Jiagui Tang, Ronghao Dang, Naijia Wang, Chengju Liu, Qijun Chen

Vision-and-Language Navigation (VLN) is a realistic but challenging task that requires an agent to locate the target region using verbal and visual cues.

Vision and Language Navigation

D2NT: A High-Performing Depth-to-Normal Translator

1 code implementation 24 Apr 2023 Yi Feng, Bohuan Xue, Ming Liu, Qijun Chen, Rui Fan

Surface normal holds significant importance in visual environmental perception, serving as a source of rich geometric information.

Surface Normal Estimation Vocal Bursts Intensity Prediction

MLANet: Multi-Level Attention Network with Sub-instruction for Continuous Vision-and-Language Navigation

1 code implementation 2 Mar 2023 Zongtao He, Liuyi Wang, Shu Li, Qingqing Yan, Chengju Liu, Qijun Chen

For a better performance in continuous VLN, we design a multi-level instruction understanding procedure and propose a novel model, Multi-Level Attention Network (MLANet).

Navigate Vision and Language Navigation

Multiple Thinking Achieving Meta-Ability Decoupling for Object Navigation

no code implementations 3 Feb 2023 Ronghao Dang, Lu Chen, Liuyi Wang, Zongtao He, Chengju Liu, Qijun Chen

We propose a meta-ability decoupling (MAD) paradigm, which brings together various object navigation methods in an architecture system, allowing them to mutually enhance each other and evolve together.

Object

SIM2E: Benchmarking the Group Equivariant Capability of Correspondence Matching Algorithms

no code implementations 21 Aug 2022 Shuai Su, Zhongkai Zhao, Yixin Fei, Shuda Li, Qijun Chen, Rui Fan

The experimental results demonstrate the importance of group equivariant algorithms for correspondence matching on various sim(2) transformation conditions.

Benchmarking

Multi-scale Wasserstein Shortest-path Graph Kernels for Graph Classification

1 code implementation 2 Jun 2022 Wei Ye, Hao Tian, Qijun Chen

To mitigate the two challenges, we propose a novel graph kernel called the Multi-scale Wasserstein Shortest-Path graph kernel (MWSP), at the heart of which is the multi-scale shortest-path node feature map, of which each element denotes the number of occurrences of a shortest path around a node.

Graph Classification

Unbiased Directed Object Attention Graph for Object Navigation

no code implementations 9 Apr 2022 Ronghao Dang, Zhuofan Shi, Liuyi Wang, Zongtao He, Chengju Liu, Qijun Chen

Thus, in this paper, we propose a directed object attention (DOA) graph to guide the agent in explicitly learning the attention relationships between objects, thereby reducing the object attention bias.

Object

Hyperspectral Imaging for cherry tomato

no code implementations 10 Mar 2022 Yun Xiang, Qijun Chen, Zhongjin Su, Lu Zhang, Zuohui Chen, Guozhi Zhou, Zhuping Yao, Qi Xuan, Yuan Cheng

Cherry tomato (Solanum lycopersicum) is popular with consumers all over the world due to its special flavor.

regression

Human-vehicle Cooperative Visual Perception for Autonomous Driving under Complex Road and Traffic Scenarios

no code implementations 17 Dec 2021 Yiyue Zhao, Cailin Lei, Yu Shen, Yuchuan Du, Qijun Chen

To enhance the visual perception capability of human-vehicle cooperative driving, this paper proposes a cooperative visual perception model.

Autonomous Driving object-detection +2

Revisiting Perspective Information for Efficient Crowd Counting

no code implementations CVPR 2019 Miaojing Shi, Zhaohui Yang, Chao Xu, Qijun Chen

Modern crowd counting methods employ deep neural networks to estimate crowd counts via crowd density regressions.

Crowd Counting regression

Novel View Synthesis for Large-scale Scene using Adversarial Loss

1 code implementation 20 Feb 2018 Xiaochuan Yin, Henglai Wei, Penghong Lin, Xiangwei Wang, Qijun Chen

Novel view synthesis aims to synthesize new images from different viewpoints of given images.

Novel View Synthesis
