Search Results for author: Jenq-Neng Hwang

Found 73 papers, 17 papers with code

MovieChat+: Question-aware Sparse Memory for Long Video Question Answering

1 code implementation • 26 Apr 2024 • Enxin Song, Wenhao Chai, Tian Ye, Jenq-Neng Hwang, Xi Li, Gaoang Wang

Recently, integrating video foundation models and large language models to build a video understanding system can overcome the limitations of specific pre-defined vision tasks.

Ranked #2 on Question Answering on NExT-QA (Open-ended VideoQA)

2k Question Answering +2

398

Single-image driven 3d viewpoint training data augmentation for effective wine label recognition

no code implementations • 12 Apr 2024 • Yueh-Cheng Huang, Hsin-Yi Chen, Cheng-Jui Hung, Jen-Hui Chuang, Jenq-Neng Hwang

Confronting the critical challenge of insufficient training data in the field of complex image recognition, this paper introduces a novel 3D viewpoint augmentation technique specifically tailored for wine label recognition.

Data Augmentation Generative Adversarial Network +1

MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection

no code implementations • 7 Apr 2024 • Hou-I Liu, Christine Wu, Jen-Hao Cheng, Wenhao Chai, Shian-Yun Wang, Gaowen Liu, Jenq-Neng Hwang, Hong-Han Shuai, Wen-Huang Cheng

Subsequently, we introduce the cross-modal residual distillation to transfer the 3D spatial cues.

Autonomous Driving Knowledge Distillation +3

VersaT2I: Improving Text-to-Image Models with Versatile Reward

no code implementations • 27 Mar 2024 • Jianshu Guo, Wenhao Chai, Jie Deng, Hsiang-Wei Huang, Tian Ye, Yichen Xu, Jiawei Zhang, Jenq-Neng Hwang, Gaoang Wang

Recent text-to-image (T2I) models have benefited from large-scale and high-quality data, demonstrating impressive performance.

Exploring Learning-based Motion Models in Multi-Object Tracking

no code implementations • 16 Mar 2024 • Hsiang-Wei Huang, Cheng-Yen Yang, Wenhao Chai, Zhongyu Jiang, Jenq-Neng Hwang

In the field of multi-object tracking (MOT), traditional methods often rely on the Kalman Filter for motion prediction, leveraging its strengths in linear motion scenarios.

motion prediction Multi-Object Tracking

Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment

no code implementations • 15 Mar 2024 • Ziyu Shan, Yujie Zhang, Qi Yang, Haichen Yang, Yiling Xu, Jenq-Neng Hwang, Xiaozhong Xu, Shan Liu

Furthermore, in the model fine-tuning stage, we propose a semantic-guided multi-view fusion module to effectively integrate the features of projected images from multiple perspectives.

Philosophy Point Cloud Quality Assessment

A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video

no code implementations • 6 Mar 2024 • Cheng-Yen Yang, Hsiang-Wei Huang, Zhongyu Jiang, Hao Wang, Farron Wallace, Jenq-Neng Hwang

Dense object counting or crowd counting has come a long way thanks to the recent development in the vision community.

Benchmarking Crowd Counting +2

Tree Counting by Bridging 3D Point Clouds with Imagery

no code implementations • 4 Mar 2024 • Lei LI, Tianfang Zhang, Zhongyu Jiang, Cheng-Yen Yang, Jenq-Neng Hwang, Stefan Oehmcke, Dimitri Pierre Johannes Gominski, Fabian Gieseke, Christian Igel

We leverage the fusion of three-dimensional LiDAR measurements and 2D imagery to facilitate the accurate counting of trees.

Management

CityGen: Infinite and Controllable 3D City Layout Generation

no code implementations • 3 Dec 2023 • Jie Deng, Wenhao Chai, Jianshu Guo, Qixuan Huang, Wenhao Hu, Jenq-Neng Hwang, Gaoang Wang

In this paper, we propose CityGen, a novel end-to-end framework for infinite, diverse and controllable 3D city layout generation. First, we propose an outpainting pipeline to extend the local layout to an infinite city layout.

See and Think: Embodied Agent in Virtual Environment

no code implementations • 26 Nov 2023 • Zhonghan Zhao, Wenhao Chai, Xuan Wang, Li Boyi, Shengyu Hao, Shidong Cao, Tian Ye, Jenq-Neng Hwang, Gaoang Wang

Vision perception involves the interpretation of visual information in the environment, which is then integrated into the LLMs component with agent state and task instruction.

Question Answering Retrieval

UniHPE: Towards Unified Human Pose Estimation via Contrastive Learning

no code implementations • 24 Nov 2023 • Zhongyu Jiang, Wenhao Chai, Lei LI, Zhuoran Zhou, Cheng-Yen Yang, Jenq-Neng Hwang

In this paper, we propose UniHPE, a unified Human Pose Estimation pipeline, which aligns features from all three modalities, i.e., 2D human pose estimation, lifting-based and image-based 3D human pose estimation, in the same pipeline.

2D Human Pose Estimation 3D Human Pose Estimation +3

The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024

no code implementations • 23 Nov 2023 • Benjamin Kiefer, Lojze Žust, Matej Kristan, Janez Perš, Matija Teršek, Arnold Wiliem, Martin Messmer, Cheng-Yen Yang, Hsiang-Wei Huang, Zhongyu Jiang, Heng-Cheng Kuo, Jie Mei, Jenq-Neng Hwang, Daniel Stadler, Lars Sommer, Kaer Huang, Aiguo Zheng, Weitu Chong, Kanokphan Lertniphonphan, Jun Xie, Feng Chen, Jian Li, Zhepeng Wang, Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Tuan-Anh Vu, Hai Nguyen-Truong, Tan-Sang Ha, Quan-Dung Pham, Sai-Kit Yeung, Yuan Feng, Nguyen Thanh Thien, Lixin Tian, Sheng-Yao Kuan, Yuan-Hao Ho, Angel Bueno Rodriguez, Borja Carrillo-Perez, Alexander Klein, Antje Alex, Yannik Steiniger, Felix Sattler, Edgardo Solano-Carrillo, Matej Fabijanić, Magdalena Šumunec, Nadir Kapetanović, Andreas Michel, Wolfgang Gross, Martin Weinmann

The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024 addresses maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV).

Ranked #1 on Semantic Segmentation on LaRS

Object Tracking Segmentation +1

Efficient Domain Adaptation via Generative Prior for 3D Infant Pose Estimation

1 code implementation • 17 Nov 2023 • Zhuoran Zhou, Zhongyu Jiang, Wenhao Chai, Cheng-Yen Yang, Lei LI, Jenq-Neng Hwang

We further apply a guided diffusion model to domain adapt 3D adult pose to infant pose to supplement small datasets.

3D Human Pose Estimation Data Augmentation +1

Vision meets mmWave Radar: 3D Object Perception Benchmark for Autonomous Driving

no code implementations • 17 Nov 2023 • Yizhou Wang, Jen-Hao Cheng, Jui-Te Huang, Sheng-Yao Kuan, Qiqian Fu, Chiming Ni, Shengyu Hao, Gaoang Wang, Guanbin Xing, Hui Liu, Jenq-Neng Hwang

This kind of radar format can enable machine learning models to generate more reliable object perception results after interacting and fusing the information or features between the camera and radar.

Autonomous Driving Sensor Fusion

Sea You Later: Metadata-Guided Long-Term Re-Identification for UAV-Based Multi-Object Tracking

no code implementations • 6 Nov 2023 • Cheng-Yen Yang, Hsiang-Wei Huang, Zhongyu Jiang, Heng-Cheng Kuo, Jie Mei, Chung-I Huang, Jenq-Neng Hwang

Re-identification (ReID) in multi-object tracking (MOT) for UAVs in maritime computer vision has been challenging for several reasons.

Multi-Object Tracking

CenterRadarNet: Joint 3D Object Detection and Tracking Framework using 4D FMCW Radar

no code implementations • 2 Nov 2023 • Jen-Hao Cheng, Sheng-Yao Kuan, Hugo Latapie, Gaowen Liu, Jenq-Neng Hwang

CenterRadarNet achieves the state-of-the-art result on the K-Radar 3D object detection benchmark.

3D Object Detection 3D Object Tracking +5

UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning

no code implementations • 19 Aug 2023 • Meiqi Sun, Zhonghan Zhao, Wenhao Chai, Hanjun Luo, Shidong Cao, Yanting Zhang, Jenq-Neng Hwang, Gaoang Wang

Our proposed model takes support images and labels as prompt guidance for a query image.

Decoder Few-Shot Learning +1

MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

1 code implementation • 31 Jul 2023 • Enxin Song, Wenhao Chai, Guanhong Wang, Yucheng Zhang, Haoyang Zhou, Feiyang Wu, Haozhe Chi, Xun Guo, Tian Ye, Yanting Zhang, Yan Lu, Jenq-Neng Hwang, Gaoang Wang

Recently, integrating video foundation models and large language models to build a video understanding system can overcome the limitations of specific pre-defined vision tasks.

Ranked #1 on zero-shot long video global-mode question answering on MovieChat-1K

Video-based Generative Performance Benchmarking (Consistency) Video-based Generative Performance Benchmarking (Contextual Understanding) +10

398

A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision

no code implementations • 7 Jul 2023 • Zhonghan Zhao, Wenhao Chai, Shengyu Hao, Wenhao Hu, Guanhong Wang, Shidong Cao, Mingli Song, Jenq-Neng Hwang, Gaoang Wang

Deep learning has the potential to revolutionize sports performance, with applications ranging from perception and comprehension to decision.

Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation

1 code implementation • 7 Jul 2023 • Zhongyu Jiang, Zhuoran Zhou, Lei LI, Wenhao Chai, Cheng-Yen Yang, Jenq-Neng Hwang

Learning-based methods have dominated the 3D human pose estimation (HPE) tasks with significantly better performance in most benchmarks than traditional optimization-based methods.

Ranked #10 on 3D Human Pose Estimation on 3DPW (PA-MPJPE metric)

3D Human Pose Estimation Image to 3D

MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling

no code implementations • 29 Jun 2023 • Zhenyu Zhang, Wenhao Chai, Zhongyu Jiang, Tian Ye, Mingli Song, Jenq-Neng Hwang, Gaoang Wang

In this paper, we propose MPM, a unified 2D-3D human pose representation framework via masked pose modeling.

3D Human Pose Estimation 3D Pose Estimation

Iterative Scale-Up ExpansionIoU and Deep Features Association for Multi-Object Tracking in Sports

1 code implementation • 22 Jun 2023 • Hsiang-Wei Huang, Cheng-Yen Yang, Jiacheng Sun, Pyong-Kun Kim, Kwang-Ju Kim, Kyoungoh Lee, Chung-I Huang, Jenq-Neng Hwang

Additionally, relying on the Kalman filter in recent tracking algorithms falls short when object motion defies its linear assumption.

Ranked #1 on Multiple Object Tracking on SportsMOT (using extra training data)

Multi-Object Tracking Multiple Object Tracking +3

Learning Dynamic Point Cloud Compression via Hierarchical Inter-frame Block Matching

no code implementations • 9 May 2023 • Shuting Xia, Tingyu Fan, Yiling Xu, Jenq-Neng Hwang, Zhu Li

3D dynamic point cloud (DPC) compression relies on mining its temporal context, which faces significant challenges due to DPC's sparsity and non-uniform structure.

Feature Correlation Motion Compensation +2

Enhancing Multi-Camera People Tracking with Anchor-Guided Clustering and Spatio-Temporal Consistency ID Re-Assignment

2 code implementations • 19 Apr 2023 • Hsiang-Wei Huang, Cheng-Yen Yang, Zhongyu Jiang, Pyong-Kun Kim, Kyoungoh Lee, Kwangju Kim, Samartha Ramkumar, Chaitanya Mullapudi, In-Su Jang, Chung-I Huang, Jenq-Neng Hwang

Multi-camera multiple people tracking has become an increasingly important area of research due to the growing demand for accurate and efficient indoor people tracking systems, particularly in settings such as retail, healthcare centers, and transit hubs.

Multiple People Tracking

170

Multi-Object Tracking by Iteratively Associating Detections with Uniform Appearance for Trawl-Based Fishing Bycatch Monitoring

no code implementations • 10 Apr 2023 • Cheng-Yen Yang, Alan Yu Shyang Tan, Melanie J. Underwood, Charlotte Bodie, Zhongyu Jiang, Steve George, Karl Warr, Jenq-Neng Hwang, Emma Jones

The aim of in-trawl catch monitoring for use in fishing operations is to detect, track and classify fish targets in real-time from video footage.

Multi-Object Tracking

Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation

1 code implementation • ICCV 2023 • Wenhao Chai, Zhongyu Jiang, Jenq-Neng Hwang, Gaoang Wang

We observe that the degradation is caused by two factors: 1) the large distribution gap over global positions of poses between the source and target datasets due to variant camera parameters and settings, and 2) the deficient diversity of local structures of poses in training.

Ranked #1 on 3D Human Pose Estimation in Limited Data on Human3.6M

3D Human Pose Estimation 3D Human Pose Estimation in Limited Data +3

DIVOTrack: A Novel Dataset and Baseline Method for Cross-View Multi-Object Tracking in DIVerse Open Scenes

2 code implementations • 15 Feb 2023 • Shenghao Hao, Peiyuan Liu, Yibing Zhan, Kaixun Jin, Zuozhu Liu, Mingli Song, Jenq-Neng Hwang, Gaoang Wang

Although cross-view multi-object tracking has received increased attention in recent years, existing datasets still have several issues, including 1) missing real-world scenarios, 2) lacking diverse scenes, 3) owning a limited number of tracks, 4) comprising only static cameras, and 5) lacking standard benchmarks, which hinder the investigation and comparison of cross-view tracking methods.

Multi-Object Tracking Object +2

Multi-target multi-camera vehicle tracking using transformer-based camera link model and spatial-temporal information

no code implementations • 18 Jan 2023 • Hsiang-Wei Huang, Cheng-Yen Yang, Jenq-Neng Hwang

Multi-target multi-camera tracking (MTMCT) of vehicles, i.e., tracking vehicles across multiple cameras, is a crucial application for the development of smart cities and intelligent traffic systems.

CameraPose: Weakly-Supervised Monocular 3D Human Pose Estimation by Leveraging In-the-wild 2D Annotations

no code implementations • 8 Jan 2023 • Cheng-Yen Yang, Jiajia Luo, Lu Xia, Yuyin Sun, Nan Qiao, Ke Zhang, Zhongyu Jiang, Jenq-Neng Hwang

By adding a camera parameter branch, any in-the-wild 2D annotations can be fed into our pipeline to boost the training diversity and the 3D poses can be implicitly learned by reprojecting back to 2D.

Ranked #68 on 3D Human Pose Estimation on Human3.6M

Data Augmentation Monocular 3D Human Pose Estimation

1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

no code implementations • 24 Nov 2022 • Benjamin Kiefer, Matej Kristan, Janez Perš, Lojze Žust, Fabio Poiesi, Fabio Augusto de Alcantara Andrade, Alexandre Bernardino, Matthew Dawkins, Jenni Raitoharju, Yitong Quan, Adem Atmaca, Timon Höfer, Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao, Lars Sommer, Raphael Spraul, Hangyue Zhao, Hongpu Zhang, Yanyun Zhao, Jan Lukas Augustin, Eui-ik Jeon, Impyeong Lee, Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Sagar Verma, Siddharth Gupta, Shishir Muralidhara, Niharika Hegde, Daitao Xing, Nikolaos Evangeliou, Anthony Tzes, Vojtěch Bartl, Jakub Špaňhel, Adam Herout, Neelanjan Bhowmik, Toby P. Breckon, Shivanand Kundargi, Tejas Anvekar, Chaitra Desai, Ramesh Ashok Tabib, Uma Mudengudi, Arpita Vats, Yang song, Delong Liu, Yonglin Li, Shuman Li, Chenhao Tan, Long Lan, Vladimir Somers, Christophe De Vleeschouwer, Alexandre Alahi, Hsiang-Wei Huang, Cheng-Yen Yang, Jenq-Neng Hwang, Pyong-Kun Kim, Kwangju Kim, Kyoungoh Lee, Shuai Jiang, Haiwen Li, Zheng Ziqiang, Tuan-Anh Vu, Hai Nguyen-Truong, Sai-Kit Yeung, Zhuang Jia, Sophia Yang, Chih-Chung Hsu, Xiu-Yu Hou, Yu-An Jhang, Simon Yang, Mau-Tsuen Yang

The 1st Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection.

Object object-detection +2

HuPR: A Benchmark for Human Pose Estimation Using Millimeter Wave Radar

1 code implementation • 22 Oct 2022 • Shih-Po Lee, Niraj Prakash Kini, Wen-Hsiao Peng, Ching-Wen Ma, Jenq-Neng Hwang

In addition to the benchmark, we propose a cross-modality training framework that leverages the ground-truth 2D keypoints representing human body joints for training, which are systematically generated from the pre-trained 2D pose estimation network based on a monocular camera input image, avoiding laborious manual label annotation efforts.

2D Pose Estimation Pose Estimation

Image-Text Retrieval with Binary and Continuous Label Supervision

no code implementations • 20 Oct 2022 • Zheng Li, Caili Guo, Zerun Feng, Jenq-Neng Hwang, Ying Jin, Yufeng Zhang

Such a binary indicator covers only a limited subset of image-text semantic relations, which is insufficient to represent relevance degrees between images and texts described by continuous labels such as image captions.

Image Captioning Retrieval +2

Unified Loss of Pair Similarity Optimization for Vision-Language Retrieval

no code implementations • 28 Sep 2022 • Zheng Li, Caili Guo, Xin Wang, Zerun Feng, Jenq-Neng Hwang, Zhongtian Du

More specifically, Triplet loss with Hard Negative mining (Triplet-HN), which is widely used in existing retrieval models to improve the discriminative ability, easily falls into local minima in training.

Contrastive Learning Retrieval +2

Observation Centric and Central Distance Recovery on Sports Player Tracking

1 code implementation • 27 Sep 2022 • Hsiang-Wei Huang, Cheng-Yen Yang, Jenq-Neng Hwang, Pyong-Kun Kim, Kwangju Kim, Kyoungoh Lee

Multi-Object Tracking over humans has improved rapidly with the development of object detection and re-identification.

Multi-Object Tracking Object +2

GaitTAKE: Gait Recognition by Temporal Attention and Keypoint-guided Embedding

no code implementations • 7 Jul 2022 • Hung-Min Hsu, Yizhou Wang, Cheng-Yen Yang, Jenq-Neng Hwang, Hoang Le Uyen Thuc, Kwang-Ju Kim

Gait recognition, which refers to the recognition or identification of a person based on their body shape and walking styles, derived from video data captured from a distance, is widely used in crime prevention, forensic identification, and social security.

Gait Recognition

GLIPv2: Unifying Localization and Vision-Language Understanding

1 code implementation • 12 Jun 2022 • Haotian Zhang, Pengchuan Zhang, Xiaowei Hu, Yen-Chun Chen, Liunian Harold Li, Xiyang Dai, Lijuan Wang, Lu Yuan, Jenq-Neng Hwang, Jianfeng Gao

We present GLIPv2, a grounded VL understanding model that serves both localization tasks (e.g., object detection, instance segmentation) and Vision-Language (VL) understanding tasks (e.g., VQA, image captioning).

Ranked #1 on Phrase Grounding on Flickr30k Entities Test (using extra training data)

Contrastive Learning Image Captioning +7

1,980

Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey

no code implementations • 22 May 2022 • Gaoang Wang, Mingli Song, Jenq-Neng Hwang

Multi-object tracking (MOT) aims to associate target objects across video frames in order to obtain entire moving trajectories.

Image Classification Multi-Object Tracking +3

Unsupervised Domain Adaptation Learning for Hierarchical Infant Pose Recognition with Synthetic Data

no code implementations • 4 May 2022 • Cheng-Yen Yang, Zhongyu Jiang, Shih-Yu Gu, Jenq-Neng Hwang, Jang-Hee Yoo

Due to limited public infant-related datasets, many works use the SMIL-based method to generate synthetic infant images for training.

Unsupervised Domain Adaptation

TR-MOT: Multi-Object Tracking by Reference

no code implementations • 30 Mar 2022 • Mingfei Chen, Yue Liao, Si Liu, Fei Wang, Jenq-Neng Hwang

RS takes previous detected results as references to aggregate the corresponding features from the combined features of the adjacent frames and makes a one-to-one track state prediction for each reference in parallel.

Multi-Object Tracking Object

The Overlooked Classifier in Human-Object Interaction Recognition

no code implementations • 10 Mar 2022 • Ying Jin, Yinpeng Chen, Lijuan Wang, JianFeng Wang, Pei Yu, Lin Liang, Jenq-Neng Hwang, Zicheng Liu

Human-Object Interaction (HOI) recognition is challenging due to two factors: (1) significant imbalance across classes and (2) requiring multiple labels per image.

Classification Human-Object Interaction Detection +4

HCIL: Hierarchical Class Incremental Learning for Longline Fishing Visual Monitoring

no code implementations • 25 Feb 2022 • Jie Mei, Suzanne Romain, Craig Rose, Kelsey Magrane, Jenq-Neng Hwang

The goal of electronic monitoring of longline fishing is to visually monitor the fish catching activities on fishing vessels based on cameras, either for regulatory compliance or catch counting.

Classification Class Incremental Learning +1

Unsupervised Severely Deformed Mesh Reconstruction (DMR) from a Single-View Image

no code implementations • 23 Jan 2022 • Jie Mei, Jingxi Yu, Suzanne Romain, Craig Rose, Kelsey Magrane, Graeme LeeSon, Jenq-Neng Hwang

Much progress has been made in the supervised learning of 3D reconstruction of rigid objects from multi-view images or a video.

3D Reconstruction

The Overlooked Classifier in Human-Object Interaction Recognition

no code implementations • arXiv 2021 • Ying Jin, Yinpeng Chen, Lijuan Wang, JianFeng Wang, Pei Yu, Lin Liang, Jenq-Neng Hwang, Zicheng Liu

Human-Object Interaction (HOI) recognition is challenging due to two factors: (1) significant imbalance across classes and (2) requiring multiple labels per image.

Ranked #1 on Human-Object Interaction Detection on HICO

Classification Human-Object Interaction Detection +4

Grounded Language-Image Pre-training

2 code implementations • CVPR 2022 • Liunian Harold Li, Pengchuan Zhang, Haotian Zhang, Jianwei Yang, Chunyuan Li, Yiwu Zhong, Lijuan Wang, Lu Yuan, Lei Zhang, Jenq-Neng Hwang, Kai-Wei Chang, Jianfeng Gao

The unification brings two benefits: 1) it allows GLIP to learn from both detection and grounding data to improve both tasks and bootstrap a good grounding model; 2) GLIP can leverage massive image-text pairs by generating grounding boxes in a self-training fashion, making the learned representation semantic-rich.

Ranked #1 on 2D Object Detection on RF100

Described Object Detection Few-Shot Object Detection +1

1,980

Track without Appearance: Learn Box and Tracklet Embedding with Local and Global Motion Patterns for Vehicle Tracking

1 code implementation • ICCV 2021 • Gaoang Wang, Renshu Gu, Zuozhu Liu, Weijie Hu, Mingli Song, Jenq-Neng Hwang

In this paper, we try to explore the significance of motion patterns for vehicle tracking without appearance information.

Multi-Object Tracking

ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot

no code implementations • ICCV 2021 • Jiarui Cai, Yizhou Wang, Jenq-Neng Hwang

One-stage long-tailed recognition methods improve the overall performance in a "seesaw" manner, i.e., either sacrifice the head's accuracy for better tail classification or elevate the head's accuracy even higher but ignore the tail.

Ranked #17 on Long-tail Learning on CIFAR-10-LT (ρ=100)

Long-tail Learning

Is Object Detection Necessary for Human-Object Interaction Recognition?

no code implementations • arXiv 2021 • Ying Jin, Yinpeng Chen, Lijuan Wang, JianFeng Wang, Pei Yu, Zicheng Liu, Jenq-Neng Hwang

This paper revisits human-object interaction (HOI) recognition at image level without using supervisions of object location and human pose.

Human-Object Interaction Detection Object +2

Deep Open Snake Tracker for Vessel Tracing

no code implementations • 19 Jul 2021 • Li Chen, Wenjin Liu, Niranjan Balu, Mahmud Mossa-Basha, Thomas S. Hatsukami, Jenq-Neng Hwang, Chun Yuan

Vessel tracing by modeling vascular structures in 3D medical images with centerlines and radii can provide useful information for vascular health.

NTIRE 2021 Multi-modal Aerial View Object Classification Challenge

no code implementations • 2 Jul 2021 • Jerrick Liu, Nathan Inkawhich, Oliver Nina, Radu Timofte, Sahil Jain, Bob Lee, Yuru Duan, Wei Wei, Lei Zhang, Songzheng Xu, Yuxuan Sun, Jiaqi Tang, Mengru Ma, Gongzhe Li, Xueli Geng, Huanqia Cai, Chengxue Cai, Sol Cummings, Casian Miron, Alexandru Pasarica, Cheng-Yen Yang, Hung-Min Hsu, Jiarui Cai, Jie Mei, Chia-Ying Yeh, Jenq-Neng Hwang, Michael Xin, Zhongkai Shangguan, Zihe Zheng, Xu Yifei, Lehan Yang, Kele Xu, Min Feng

In this paper, we introduce the first Challenge on Multi-modal Aerial View Object Classification (MAVOC) in conjunction with the NTIRE 2021 workshop at CVPR.

Classification Object

Rethinking of Radar's Role: A Camera-Radar Dataset and Systematic Annotator via Coordinate Alignment

no code implementations • 11 May 2021 • Yizhou Wang, Gaoang Wang, Hung-Min Hsu, Hui Liu, Jenq-Neng Hwang

Radar has long been a common sensor on autonomous vehicles for obstacle ranging and speed estimation.

Autonomous Vehicles object-detection +2

Split and Connect: A Universal Tracklet Booster for Multi-Object Tracking

no code implementations • 6 May 2021 • Gaoang Wang, Yizhou Wang, Renshu Gu, Weijie Hu, Jenq-Neng Hwang

To address such common challenges in most of the existing trackers, in this paper, a tracklet booster algorithm is proposed, which can be built upon any other tracker.

Multi-Object Tracking

Multi-Target Multi-Camera Tracking of Vehicles using Metadata-Aided Re-ID and Trajectory-Based Camera Link Model

no code implementations • 3 May 2021 • Hung-Min Hsu, Jiarui Cai, Yizhou Wang, Jenq-Neng Hwang, Kwang-Ju Kim

In this paper, we propose a novel framework for multi-target multi-camera tracking (MTMCT) of vehicles based on metadata-aided re-identification (MA-ReID) and the trajectory-based camera link model (TCLM).

Clustering

RODNet: A Real-Time Radar Object Detection Network Cross-Supervised by Camera-Radar Fused Object 3D Localization

1 code implementation • 9 Feb 2021 • Yizhou Wang, Zhongyu Jiang, Yudong Li, Jenq-Neng Hwang, Guanbin Xing, Hui Liu

Finally, we propose a method to evaluate the object detection performance of the RODNet.

Object Object Detection +1

219

Absolute 3D Pose Estimation and Length Measurement of Severely Deformed Fish from Monocular Videos in Longline Fishing

no code implementations • 9 Feb 2021 • Jie Mei, Jenq-Neng Hwang, Suzanne Romain, Craig Rose, Braden Moore, Kelsey Magrane

Finally, with a closed-form solution, the relative 3D fish pose can help locate absolute 3D keypoints, resulting in the frame-based absolute fish length measurement, which is further refined based on the statistical temporal inference for the optimal fish length measurement from the video clip.

3D Pose Estimation

Video-based Hierarchical Species Classification for Longline Fishing Monitoring

no code implementations • 6 Feb 2021 • Jie Mei, Jenq-Neng Hwang, Suzanne Romain, Craig Rose, Braden Moore, Kelsey Magrane

However, with a known non-overlapping hierarchical data structure provided by fisheries scientists, our method enforces the hierarchical data structure and introduces an efficient training and inference strategy for video-based fisheries data.

Classification General Classification

Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution

no code implementations • 31 Oct 2020 • Renshu Gu, Gaoang Wang, Jenq-Neng Hwang

Videos that contain multiple potentially occluded people captured from freely moving monocular cameras are very common in real-world scenarios, while 3D HPE for such scenarios is quite challenging, partially because there is a lack of such data with accurate 3D ground truth labels in existing datasets.

3D Human Pose Estimation 3D Pose Estimation

Traffic-Aware Multi-Camera Tracking of Vehicles Based on ReID and Camera Link Model

no code implementations • 22 Aug 2020 • Hung-Min Hsu, Yizhou Wang, Jenq-Neng Hwang

In this paper, we propose an effective and reliable MTMCT framework for vehicles, which consists of a traffic-aware single camera tracking (TSCT) algorithm, a trajectory-based camera link model (CLM) for vehicle re-identification (ReID), and a hierarchical clustering algorithm to obtain the cross camera vehicle trajectories.

Clustering Vehicle Re-Identification

Automated Intracranial Artery Labeling using a Graph Neural Network and Hierarchical Refinement

1 code implementation • 11 Jul 2020 • Li Chen, Thomas Hatsukami, Jenq-Neng Hwang, Chun Yuan

Automatically labeling intracranial arteries (ICA) with their anatomical names is beneficial for feature extraction and detailed analysis of intracranial vascular structures.

IA-MOT: Instance-Aware Multi-Object Tracking with Motion Consistency

no code implementations • 24 Jun 2020 • Jiarui Cai, Yizhou Wang, Haotian Zhang, Hung-Min Hsu, Chengqian Ma, Jenq-Neng Hwang

Meanwhile, the spatial attention, which focuses on the foreground within the bounding boxes, is generated from the given instance masks and applied to the extracted embedding features.

Multi-Object Tracking Multiple Object Tracking +1

OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement

no code implementations • 8 Mar 2020 • Fangyi Zhu, Jenq-Neng Hwang, Zhanyu Ma, Guang Chen, Jun Guo

Thereafter, we construct a new dataset, providing consistent object-sentence pairs, to facilitate effective cross-modal learning.

Object Sentence +1

RODNet: Radar Object Detection Using Cross-Modal Supervision

1 code implementation • 3 Mar 2020 • Yizhou Wang, Zhongyu Jiang, Xiangyu Gao, Jenq-Neng Hwang, Guanbin Xing, Hui Liu

Radar is usually more robust than the camera in severe driving scenarios, e.g., weak/strong lighting and bad weather.

Autonomous Driving Object +3

219

Eye in the Sky: Drone-Based Object Tracking and 3D Localization

no code implementations • 18 Oct 2019 • Haotian Zhang, Gaoang Wang, Zhichao Lei, Jenq-Neng Hwang

Drones, or general UAVs, equipped with a single camera have been widely deployed to a broad range of applications, such as aerial photography, fast goods delivery and most importantly, surveillance.

drone-based object tracking Multi-Object Tracking +3

A New Technique of Camera Calibration: A Geometric Approach Based on Principal Lines

no code implementations • 18 Aug 2019 • Jen-Hui Chuang, Chih-Hui Ho, Ardian Umam, Hsin-Yi Chen, Mu-Tien Lu, Jenq-Neng Hwang, Tai-An Chen

Camera calibration is a crucial prerequisite in many applications of computer vision.

Camera Calibration

CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification

no code implementations • CVPR 2019 • Zheng Tang, Milind Naphade, Ming-Yu Liu, Xiaodong Yang, Stan Birchfield, Shuo Wang, Ratnesh Kumar, David Anastasiu, Jenq-Neng Hwang

Urban traffic optimization using traffic cameras as sensors is driving the need to advance state-of-the-art multi-target multi-camera (MTMC) tracking.

object-detection Object Detection +1

MOANA: An Online Learned Adaptive Appearance Model for Robust Multiple Object Tracking in 3D

no code implementations • 9 Jan 2019 • Zheng Tang, Jenq-Neng Hwang

Multiple object tracking has been a challenging field, mainly due to noisy detection sets and identity switch caused by occlusion and similar appearance among nearby targets.

Multiple Object Tracking

Exploit the Connectivity: Multi-Object Tracking with TrackletNet

1 code implementation • 18 Nov 2018 • Gaoang Wang, Yizhou Wang, Haotian Zhang, Renshu Gu, Jenq-Neng Hwang

Multi-object tracking (MOT) is an important and practical task related to both surveillance systems and moving camera applications, such as autonomous driving and robotic vision.

Ranked #19 on Multi-Object Tracking on MOT16

Autonomous Driving Multi-Object Tracking +1

Y-net: 3D intracranial artery segmentation using a convolutional autoencoder

no code implementations • 19 Dec 2017 • Li Chen, Yanjun Xie, Jie Sun, Niranjan Balu, Mahmud Mossa-Basha, Kristi Pimentel, Thomas S. Hatsukami, Jenq-Neng Hwang, Chun Yuan

Automated segmentation of intracranial arteries on magnetic resonance angiography (MRA) allows for quantification of cerebrovascular features, which provides tools for understanding aging and pathophysiological adaptations of the cerebrovascular system.

Binary Classification General Classification +1

Virtual Blood Vessels in Complex Background using Stereo X-ray Images

no code implementations • 22 Sep 2017 • Qiuyu Chen, Ryoma Bise, Lin Gu, Yinqiang Zheng, Imari Sato, Jenq-Neng Hwang, Nobuaki Imanishi, Sadakazu Aiso

We propose a fully automatic system to reconstruct and visualize 3D blood vessels in an Augmented Reality (AR) system from stereo X-ray images with bones and body fat.

Stereo Matching Stereo Matching Hand

Multiple-Kernel Based Vehicle Tracking Using 3D Deformable Model and Camera Self-Calibration

no code implementations • 22 Aug 2017 • Zheng Tang, Gaoang Wang, Tao Liu, Young-Gun Lee, Adwin Jahn, Xu Liu, Xiaodong He, Jenq-Neng Hwang

In this challenge, we propose a model-based vehicle localization method, which builds a kernel at each patch of the 3D deformable vehicle model and associates them with constraints in 3D space.

Ensemble Learning object-detection +1

DesnowNet: Context-Aware Deep Network for Snow Removal

no code implementations • 15 Aug 2017 • Yun-Fu Liu, Da-Wei Jaw, Shih-Chia Huang, Jenq-Neng Hwang

Existing learning-based atmospheric particle-removal approaches such as those used for rainy and hazy images are designed with strong assumptions regarding spatial frequency, trajectory, and translucency.

Semantic Segmentation Snow Removal

Underwater Fish Tracking for Moving Cameras based on Deformable Multiple Kernels

no code implementations • 5 Mar 2016 • Meng-Che Chuang, Jenq-Neng Hwang, Jian-Hui Ye, Shih-Chia Huang, Kresimir Williams

In this paper, a novel tracking algorithm based on the deformable multiple kernels (DMK) is proposed to address these challenges.

A Feature Learning and Object Recognition Framework for Underwater Fish Images

no code implementations • 5 Mar 2016 • Meng-Che Chuang, Jenq-Neng Hwang, Kresimir Williams

Toward this end, we propose an underwater fish recognition framework that consists of a fully unsupervised feature learning technique and an error-resilient classifier.

Clustering Object Recognition

Tracking Live Fish from Low-Contrast and Low-Frame-Rate Stereo Videos

no code implementations • 15 Apr 2015 • Meng-Che Chuang, Jenq-Neng Hwang, Kresimir Williams, Richard Towler

Non-extractive fish abundance estimation with the aid of visual analysis has drawn increasing attention.

Segmentation Stereo Matching +1
