no code implementations • 28 Apr 2024 • Zhiwei Huang, Yikang Zhang, Qijun Chen, Rui Fan
The cornerstone of our framework and toolbox is the cross-modal mask matching (C3M) algorithm, developed based on a state-of-the-art (SoTA) LVM and capable of generating sufficient and reliable matches.
1 code implementation • 16 Apr 2024 • Liuyi Wang, Zongtao He, Ronghao Dang, Mengjiao Shen, Chengju Liu, Qijun Chen
In the pursuit of robust and generalizable environment perception and language understanding, the ubiquitous challenge of dataset bias continues to plague vision-and-language navigation (VLN) agents, hindering their performance in unseen environments.
no code implementations • 9 Apr 2024 • Chuang-Wei Liu, Qijun Chen, Rui Fan
We believe this new paradigm will pave the way for the next generation of stereo matching networks.
1 code implementation • 4 Apr 2024 • Jiahang Li, Peng Yun, Qijun Chen, Rui Fan
In this study, we take one step toward this new research area by exploring a feasible strategy to fully exploit VFM features for RGB-thermal scene parsing.
Ranked #1 on Thermal Image Segmentation on KP day-night
no code implementations • 13 Mar 2024 • Sicen Guo, Zhiyuan Wu, Qijun Chen, Ioannis Pitas, Rui Fan
We introduce the Learning to Infuse "X" (LIX) framework, with novel contributions in both logit distillation and feature distillation aspects.
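The logit-distillation side of such a framework typically minimizes the KL divergence between temperature-softened teacher and student distributions. As a hedged illustration (this is the generic Hinton-style objective, not the LIX framework's specific loss; the function names are my own):

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def logit_distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between softened teacher and student distributions,
    scaled by T^2 to keep gradient magnitudes comparable across temperatures."""
    p = softmax(teacher_logits, T)  # soft teacher targets
    q = softmax(student_logits, T)  # student predictions
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return (T ** 2) * kl.mean()
```

The loss is zero when the student reproduces the teacher's logits exactly and grows as their softened distributions diverge.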
no code implementations • 6 Mar 2024 • Liuyi Wang, Zongtao He, Ronghao Dang, Huiyi Chen, Chengju Liu, Qijun Chen
Vision-and-Language Navigation (VLN) has gained significant research interest in recent years due to its potential applications in real-world scenarios.
no code implementations • 29 Feb 2024 • Yi Feng, Yu Ma, Qijun Chen, Ioannis Pitas, Rui Fan
Feature-fusion networks with duplex encoders have proven to be an effective technique to solve the freespace detection problem.
no code implementations • 24 Feb 2024 • Xiao Lin, Minghao Zhu, Ronghao Dang, Guangliang Zhou, Shaolong Shu, Feng Lin, Chengju Liu, Qijun Chen
Inspired by this motivation, we propose CLIPose, a novel 6D pose estimation framework that employs a pre-trained vision-language model to better learn object category information, fully leveraging the abundant semantic knowledge in the image and text modalities.
no code implementations • 21 Jan 2024 • Zhiyuan Wu, Yi Feng, Chuang-Wei Liu, Fisher Yu, Qijun Chen, Rui Fan
Hence, in this article, we introduce S$^3$M-Net, a novel joint learning framework developed to perform semantic segmentation and stereo matching simultaneously.
1 code implementation • 19 Dec 2023 • Wengang Guo, Jiayi Yang, Huilin Yin, Qijun Chen, Wei Ye
Experimental results demonstrate that our method, PICNN (standard CNNs combined with our proposed pathway), is more interpretable than standard CNNs while achieving comparable or higher discriminative power.
no code implementations • 13 Dec 2023 • Jingwei Yang, Bohuan Xue, Yi Feng, Deming Wang, Rui Fan, Qijun Chen
This article introduces three-filters-to-normal+ (3F2N+), an extension of our previous work three-filters-to-normal (3F2N), with a specific focus on incorporating discontinuity discrimination capability into surface normal estimators (SNEs).
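The core task such SNEs solve is recovering per-pixel surface normals from a depth map. As a hedged sketch of the general idea only (back-projecting pixels to 3D and crossing local tangent vectors; this is a generic baseline, not the 3F2N/3F2N+ filter formulation, and the function name is my own):

```python
import numpy as np

def normals_from_depth(depth, fx, fy, cx, cy):
    """Estimate per-pixel surface normals from a depth map.

    Back-projects every pixel to a 3-D point using the pinhole intrinsics
    (fx, fy, cx, cy), then takes the cross product of the horizontal and
    vertical tangent vectors obtained by central finite differences.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    pts = np.stack([x, y, depth], axis=-1)           # (h, w, 3) point cloud
    du = np.zeros_like(pts)
    dv = np.zeros_like(pts)
    du[:, 1:-1] = pts[:, 2:] - pts[:, :-2]           # horizontal tangent
    dv[1:-1, :] = pts[2:, :] - pts[:-2, :]           # vertical tangent
    n = np.cross(du, dv)
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.clip(norm, 1e-9, None)             # unit normals (zeros at border)
```

For a fronto-parallel plane (constant depth), the recovered interior normals align with the optical axis, as expected.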
no code implementations • 25 Oct 2023 • Xiao Lin, Deming Wang, Guangliang Zhou, Chengju Liu, Qijun Chen
To improve robustness to occlusion, we adopt a Transformer to exchange global information, ensuring that each local feature contains global information.
1 code implementation • 8 Oct 2023 • Ronghao Dang, Jiangyan Feng, Haodong Zhang, Chongjian Ge, Lin Song, Lijun Gong, Chengju Liu, Qijun Chen, Feng Zhu, Rui Zhao, Yibing Song
To encompass common detection expressions, we employ an emerging vision-language model (VLM) and a large language model (LLM) to generate instructions guided by text prompts and object bounding boxes, as the generalization ability of foundation models is effective at producing human-like expressions (e.g., describing an object's properties, category, and relationships).
no code implementations • 19 Sep 2023 • Hongbo Zhao, Yikang Zhang, Qijun Chen, Rui Fan
Instead, we introduce four new evaluation metrics to quantify the robustness and accuracy of extrinsic parameter estimation, applicable to both single-pair and multi-pair cases.
no code implementations • 19 Sep 2023 • Jiahang Li, Yikang Zhang, Peng Yun, Guangliang Zhou, Qijun Chen, Rui Fan
Additionally, we release SYN-UDTIRI, the first large-scale road scene parsing dataset, which contains over 10,407 RGB images, dense depth images, and the corresponding pixel-level annotations for both freespace and road defects of different shapes and sizes.
1 code implementation • 1 Sep 2023 • Minghao Zhu, Xiao Lin, Ronghao Dang, Chengju Liu, Qijun Chen
As the most essential property in a video, motion information is critical to a robust and generalized video representation.
no code implementations • 31 Aug 2023 • Chenbo Zhou, Shuai Su, Qijun Chen, Rui Fan
Accurate and robust correspondence matching is of utmost importance for various 3D computer vision tasks.
no code implementations • 29 Jul 2023 • Yi Feng, Ruge Zhang, Jiayuan Du, Qijun Chen, Rui Fan
Additionally, our proposed freespace optical flow model has a wide range of applications in automated driving, providing a geometric constraint for freespace detection, vehicle localization, and more.
1 code implementation • 14 Jun 2023 • Linfeng Yuan, Miaojing Shi, Zijie Yue, Qijun Chen
Referring video object segmentation (RVOS) aims to segment the target instance referred by a given text expression in a video clip.
Ranked #12 on Referring Expression Segmentation on Refer-YouTube-VOS (2021 public validation) (using extra training data)
no code implementations • 19 May 2023 • Liuyi Wang, Chengju Liu, Zongtao He, Shu Li, Qingqing Yan, Huiyi Chen, Qijun Chen
The experimental results demonstrate that PASTS outperforms all existing speaker models and successfully improves the performance of previous VLN models, achieving state-of-the-art performance on the standard Room-to-Room (R2R) dataset.
1 code implementation • 5 May 2023 • Liuyi Wang, Zongtao He, Jiagui Tang, Ronghao Dang, Naijia Wang, Chengju Liu, Qijun Chen
Vision-and-Language Navigation (VLN) is a realistic but challenging task that requires an agent to locate the target region using verbal and visual cues.
1 code implementation • 24 Apr 2023 • Yi Feng, Bohuan Xue, Ming Liu, Qijun Chen, Rui Fan
Surface normals hold significant importance in visual environmental perception, serving as a source of rich geometric information.
no code implementations • 18 Apr 2023 • Sicen Guo, Jiahang Li, Yi Feng, Dacheng Zhou, Denghuang Zhang, Chen Chen, Shuai Su, Xingyi Zhu, Qijun Chen, Rui Fan
To foster advancements in this burgeoning field, we have launched an online open-source benchmark suite, referred to as UDTIRI.
1 code implementation • 2 Mar 2023 • Zongtao He, Liuyi Wang, Shu Li, Qingqing Yan, Chengju Liu, Qijun Chen
For a better performance in continuous VLN, we design a multi-level instruction understanding procedure and propose a novel model, Multi-Level Attention Network (MLANet).
no code implementations • 3 Feb 2023 • Ronghao Dang, Lu Chen, Liuyi Wang, Zongtao He, Chengju Liu, Qijun Chen
We propose a meta-ability decoupling (MAD) paradigm, which brings together various object navigation methods within a unified architecture, allowing them to mutually enhance each other and evolve together.
no code implementations • 21 Aug 2022 • Shuai Su, Zhongkai Zhao, Yixin Fei, Shuda Li, Qijun Chen, Rui Fan
The experimental results demonstrate the importance of group-equivariant algorithms for correspondence matching under various Sim(2) transformations.
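For concreteness, a Sim(2) transform composes a planar rotation, a uniform scale, and a translation; equivariant matching means correspondences are preserved under any such transform. A minimal sketch (the function name `sim2` is my own, not from the paper):

```python
import numpy as np

def sim2(points, theta, s, t):
    """Apply a Sim(2) transform -- rotation by `theta`, uniform scale `s`,
    translation `t` -- to an (N, 2) array of 2-D points."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return s * points @ R.T + t
```

Note that pairwise distances scale by exactly `s`, so a matcher equivariant to this group should produce the same correspondences before and after the transform.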
no code implementations • ICCV 2023 • Ronghao Dang, Liuyi Wang, Zongtao He, Shuai Su, Chengju Liu, Qijun Chen
After seeing the target, we remember its location and navigate to it.
1 code implementation • 2 Jun 2022 • Wei Ye, Hao Tian, Qijun Chen
To mitigate these two challenges, we propose a novel graph kernel, the Multi-scale Wasserstein Shortest-Path graph kernel (MWSP), at the heart of which is the multi-scale shortest-path node feature map, each element of which denotes the number of occurrences of a shortest path around a node.
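The building block of such a feature map is counting shortest paths around each node. As a hedged, simplified stand-in (a BFS histogram of shortest-path lengths from one node; the real MWSP feature map is richer, and the function name is my own):

```python
from collections import deque

def sp_length_histogram(adj, node, max_len):
    """BFS shortest-path distances from `node` in an unweighted graph
    given as an adjacency dict; returns a histogram counting how many
    nodes lie at each shortest-path distance 1..max_len."""
    dist = {node: 0}
    q = deque([node])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    hist = [0] * max_len
    for d in dist.values():
        if 1 <= d <= max_len:
            hist[d - 1] += 1
    return hist
```

Per-node histograms like this can then be compared across graphs (MWSP does so with a Wasserstein distance) to yield a graph similarity.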
no code implementations • 9 Apr 2022 • Ronghao Dang, Zhuofan Shi, Liuyi Wang, Zongtao He, Chengju Liu, Qijun Chen
Thus, in this paper, we propose a directed object attention (DOA) graph to guide the agent in explicitly learning the attention relationships between objects, thereby reducing the object attention bias.
no code implementations • 10 Mar 2022 • Yun Xiang, Qijun Chen, Zhongjin Su, Lu Zhang, Zuohui Chen, Guozhi Zhou, Zhuping Yao, Qi Xuan, Yuan Cheng
Cherry tomato (Solanum lycopersicum) is popular with consumers around the world due to its special flavor.
no code implementations • 17 Dec 2021 • Yiyue Zhao, Cailin Lei, Yu Shen, Yuchuan Du, Qijun Chen
To enhance the visual perception capability of human-vehicle cooperative driving, this paper proposes a cooperative visual perception model.
no code implementations • CVPR 2019 • Miaojing Shi, Zhaohui Yang, Chao Xu, Qijun Chen
Modern crowd counting methods employ deep neural networks to estimate crowd counts via crowd density regressions.
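The density-regression target behind such methods is built by placing a unit-mass Gaussian at every annotated head location, so that integrating the density map recovers the crowd count. A minimal sketch of this standard construction (not the paper's specific model; the function name is my own):

```python
import numpy as np

def density_map(points, shape, sigma=4.0):
    """Build a crowd-density ground-truth map: one unit-mass Gaussian per
    annotated head position (px, py). The map's integral equals the count,
    which is what a density-regression network is trained to predict."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    dmap = np.zeros(shape, dtype=np.float64)
    for (px, py) in points:
        g = np.exp(-((xx - px) ** 2 + (yy - py) ** 2) / (2 * sigma ** 2))
        dmap += g / g.sum()  # normalize so each person contributes mass exactly 1
    return dmap
```

At inference time, summing the predicted density map yields the estimated count, which is why the map integral must match the number of annotations in the ground truth.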
1 code implementation • 20 Feb 2018 • Xiaochuan Yin, Henglai Wei, Penghong Lin, Xiangwei Wang, Qijun Chen
Novel view synthesis aims to synthesize new images from different viewpoints of given images.
no code implementations • ICCV 2017 • Xiaochuan Yin, Xiangwei Wang, Xiaoguo Du, Qijun Chen
Normally, road plane and camera height are specified as reference to recover the scale.