Search Results for author: Li Xu

Found 47 papers, 9 papers with code

Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer

no code implementations • 21 Apr 2024 • Kepeng Xu, Li Xu, Gang He, Wenxin Yu, Yunsong Li

Multiple complex degradations are coupled in low-quality video faces in the real world.

6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation

no code implementations • 29 Dec 2023 • Li Xu, Haoxuan Qu, Yujun Cai, Jun Liu

Estimating the 6D object pose from a single RGB image often involves noise and indeterminacy due to challenges such as occlusions and cluttered backgrounds.

6D Pose Estimation using RGB Denoising +1

Trustworthy Large Models in Vision: A Survey

no code implementations • 16 Nov 2023 • Ziyan Guo, Li Xu, Jun Liu

The rapid progress of Large Models (LMs) has recently revolutionized various fields of deep learning with remarkable grades, ranging from Natural Language Processing (NLP) to Computer Vision (CV).

Deep Neural Network Identification of Limnonectes Species and New Class Detection Using Image Data

no code implementations • 15 Nov 2023 • Li Xu, Yili Hong, Eric P. Smith, David S. McLeod, Xinwei Deng, Laura J. Freeman

We demonstrate that deep neural networks can successfully automate the classification of an image into a known species group for which it has been trained.

Out of Distribution (OOD) Detection

Bridged-GNN: Knowledge Bridge Learning for Effective Knowledge Transfer

no code implementations • 18 Aug 2023 • Wendong Bi, Xueqi Cheng, Bingbing Xu, Xiaoqian Sun, Li Xu, HuaWei Shen

Transfer learning has been a feasible way to transfer knowledge from high-quality external data of source domains to limited data of target domains, which follows a domain-level knowledge transfer to learn a shared posterior distribution.

Retrieval Transfer Learning

Towards Robust SDRTV-to-HDRTV via Dual Inverse Degradation Network

no code implementations • 7 Jul 2023 • Kepeng Xu, Li Xu, Gang He, Wenxin Yu, Yunsong Li

In this study, we address the emerging necessity of converting Standard Dynamic Range Television (SDRTV) content into High Dynamic Range Television (HDRTV) in light of the limited amount of native HDRTV content.

inverse tone mapping Inverse-Tone-Mapping +2

Multi-modal Pre-training for Medical Vision-language Understanding and Generation: An Empirical Study with A New Benchmark

1 code implementation • 10 Jun 2023 • Li Xu, Bo Liu, Ameer Hamza Khan, Lu Fan, Xiao-Ming Wu

With the availability of large-scale, comprehensive, and general-purpose vision-language (VL) datasets such as MSCOCO, vision-language pre-training (VLP) has become an active area of research and proven to be effective for various VL tasks such as visual-question answering.

Medical Report Generation Question Answering +3

Meta Compositional Referring Expression Segmentation

no code implementations • CVPR 2023 • Li Xu, Mark He Huang, Xindi Shang, Zehuan Yuan, Ying Sun, Jun Liu

Then, following a novel meta optimization scheme to optimize the model to obtain good testing performance on the virtual testing sets after training on the virtual training set, our framework can effectively drive the model to better capture semantics and visual representations of individual concepts, and thus obtain robust generalization performance even when handling novel compositions.

Meta-Learning Referring Expression +2

Predicting the Silent Majority on Graphs: Knowledge Transferable Graph Neural Network

1 code implementation • 2 Feb 2023 • Wendong Bi, Bingbing Xu, Xiaoqian Sun, Li Xu, HuaWei Shen, Xueqi Cheng

To combat the above challenges, we propose Knowledge Transferable Graph Neural Network (KT-GNN), which models distribution shifts during message passing and representation learning by transferring knowledge from vocal nodes to silent nodes.

Representation Learning

SDRTV-to-HDRTV Conversion via Spatial-Temporal Feature Fusion

no code implementations • 4 Nov 2022 • Kepeng Xu, Li Xu, Gang He, Chang Wu, Zijia Ma, Ming Sun, Yu-Wing Tai

To evaluate the performance of the proposed method, we construct a corresponding multi-frame dataset using HDR video of the HDR10 standard to conduct a comprehensive evaluation of different methods.

Individualized Conditioning and Negative Distances for Speaker Separation

no code implementations • 12 Oct 2022 • Tao Sun, Nidal Abuhajar, Shuyu Gong, Zhewei Wang, Charles D. Smith, Xianhui Wang, Li Xu, Jundong Liu

Speaker separation aims to extract multiple voices from a mixed signal.

Speaker Separation

Heatmap Distribution Matching for Human Pose Estimation

no code implementations • 3 Oct 2022 • Haoxuan Qu, Li Xu, Yujun Cai, Lin Geng Foo, Jun Liu

In this paper, we show that when the heatmap prediction is optimized in such a way, the model performance of body joint localization, which is the intrinsic objective of this task, may not be consistently improved during the optimization process.

2D Human Pose Estimation Pose Estimation

Global Priors Guided Modulation Network for Joint Super-Resolution and Inverse Tone-Mapping

no code implementations • 14 Aug 2022 • Gang He, Shaoyi Long, Li Xu, Chang Wu, Jinjia Zhou, Ming Sun, Xing Wen, Yurong Dai

Joint super-resolution and inverse tone-mapping (SR-ITM) aims to enhance the visual quality of videos that have quality deficiencies in resolution and dynamic range.

4k inverse tone mapping +3

Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

no code implementations • 23 Jul 2022 • Li Xu, Haoxuan Qu, Jason Kuen, Jiuxiang Gu, Jun Liu

Video scene graph generation (VidSGG) aims to parse the video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video.

Graph Generation Meta-Learning +2

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

no code implementations • 25 May 2022 • Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, Javen Qinfeng Shi, Dong Gong, Dan Zhu, Mengdi Sun, Guannan Chen, Yang Hu, Haowei Li, Baozhu Zou, Zhen Liu, Wenjie Lin, Ting Jiang, Chengzhi Jiang, Xinpeng Li, Mingyan Han, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Juan Marín-Vega, Michael Sloth, Peter Schneider-Kamp, Richard Röttger, Chunyang Li, Long Bao, Gang He, Ziyao Xu, Li Xu, Gen Zhan, Ming Sun, Xing Wen, Junlin Li, Shuang Feng, Fei Lei, Rui Liu, Junxiang Ruan, Tianhong Dai, Wei Li, Zhan Lu, Hengyan Liu, Peian Huang, Guangyu Ren, Yonglin Luo, Chang Liu, Qiang Tu, Fangya Li, Ruipeng Gang, Chenghua Li, Jinjing Li, Sai Ma, Chenming Liu, Yizhen Cao, Steven Tel, Barthelemy Heyrman, Dominique Ginhac, Chul Lee, Gahyeon Kim, Seonghyun Park, An Gia Vien, Truong Thanh Nhat Mai, Howoon Yoon, Tu Vo, Alexander Holston, Sheir Zaheer, Chan Y. Park

The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e., solutions cannot exceed a given number of operations).

Image Restoration Vocal Bursts Intensity Prediction

Transcoded Video Restoration by Temporal Spatial Auxiliary Network

1 code implementation • 15 Dec 2021 • Li Xu, Gang He, Jinjia Zhou, Jie Lei, Weiying Xie, Yunsong Li, Yu-Wing Tai

On most video platforms, such as YouTube and TikTok, the played videos have usually undergone multiple video encodings, such as hardware encoding by recording devices, software encoding by video editing apps, and single/multiple video transcoding by video application servers.

Video Editing Video Restoration

Statistical Perspectives on Reliability of Artificial Intelligence Systems

no code implementations • 9 Nov 2021 • Yili Hong, Jiayi Lian, Li Xu, Jie Min, Yueyao Wang, Laura J. Freeman, Xinwei Deng

We also describe recent developments in modeling and analysis of AI reliability and outline statistical research challenges in this area, including out-of-distribution detection, the effect of the training set, adversarial attacks, model accuracy, and uncertainty quantification, and discuss how those topics can be related to AI reliability, with illustrative examples.

Out-of-Distribution Detection Uncertainty Quantification

Recent Advances of Continual Learning in Computer Vision: An Overview

no code implementations • 23 Sep 2021 • Haoxuan Qu, Hossein Rahmani, Li Xu, Bryan Williams, Jun Liu

In contrast to batch learning where all training data is available at once, continual learning represents a family of methods that accumulate knowledge and learn continuously with data available in sequential order.

Continual Learning Knowledge Distillation

The Multi-Modal Video Reasoning and Analyzing Competition

no code implementations • 18 Aug 2021 • Haoran Peng, He Huang, Li Xu, Tianjiao Li, Jun Liu, Hossein Rahmani, Qiuhong Ke, Zhicheng Guo, Cong Wu, Rongchang Li, Mang Ye, Jiahao Wang, Jiaxu Zhang, Yuanzhong Liu, Tao He, Fuwei Zhang, Xianbin Liu, Tao Lin

In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop in conjunction with ICCV 2021.

Action Recognition Person Re-Identification +3

Efficient Two-Step Networks for Temporal Action Segmentation

1 code implementation • Neurocomputing 2021 • Yunheng Li, Zhuben Dong, Kaiyuan Liu, Lin Feng, Lianyu Hu, Jie Zhu, Li Xu, YuHan Wang, Shenglan Liu

Due to boundary ambiguity and over-segmentation issues, identifying all the frames in long untrimmed videos is still challenging.

Ranked #12 on Action Segmentation on GTEA

Action Segmentation Segmentation +1

SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events

3 code implementations • CVPR 2021 • Li Xu, He Huang, Jun Liu

In this paper, we create a novel dataset, SUTD-TrafficQA (Traffic Question Answering), which takes the form of video QA based on the collected 10,080 in-the-wild videos and annotated 62,535 QA pairs, for benchmarking the cognitive capability of causal inference and event understanding models in complex traffic scenarios.

Ranked #2 on Video Question Answering on SUTD-TrafficQA

Autonomous Vehicles Benchmarking +4

Unifying deterministic and stochastic ecological dynamics via a landscape-flux approach

no code implementations • 15 Mar 2021 • Li Xu, Denis Patterson, Ann Carla Staver, Simon Asher Levin, Jin Wang

We develop a landscape-flux framework to investigate observed frequency distributions of vegetation and the stability of these ecological systems under fluctuations.

Enhancement of Superconductivity Linked with Linear-in-Temperature/Field Resistivity in Ion-Gated FeSe Films

no code implementations • 11 Mar 2021 • Xingyu Jiang, Mingyang Qin, Xinjian Wei, Zhongpei Feng, Jiezun Ke, Haipeng Zhu, Fucong Chen, Liping Zhang, Li Xu, Xu Zhang, Ruozhou Zhang, Zhongxu Wei, Peiyu Xiong, Qimei Liang, Chuanying Xi, Zhaosheng Wang, Jie Yuan, Beiyi Zhu, Kun Jiang, Ming Yang, Junfeng Wang, Jiangping Hu, Tao Xiang, Brigitte Leridon, Rong Yu, Qihong Chen, Kui Jin, Zhongxian Zhao

Iron selenide (FeSe), the structurally simplest iron-based superconductor, has attracted tremendous interest in recent years.

Superconductivity

SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering

2 code implementations • 18 Feb 2021 • Bo Liu, Li-Ming Zhan, Li Xu, Lin Ma, Yan Yang, Xiao-Ming Wu

We show that SLAKE can be used to facilitate the development and evaluation of Med-VQA systems.

Medical Visual Question Answering Question Answering +1

Modelling Universal Order Book Dynamics in Bitcoin Market

no code implementations • 15 Jan 2021 • Fabin Shi, Nathan Aden, Shengda Huang, Neil Johnson, Xiaoqian Sun, Jinhua Gao, Li Xu, HuaWei Shen, Xueqi Cheng, Chaoming Song

Understanding the emergence of universal features such as the stylized facts in markets is a long-standing challenge that has drawn much attention from economists and physicists.

Learning to Benchmark: Determining Best Achievable Misclassification Error from Training Data

2 code implementations • 16 Sep 2019 • Morteza Noshad, Li Xu, Alfred Hero

In this problem the objective is to establish statistically consistent estimates of the Bayes misclassification error rate without having to learn a Bayes-optimal classifier.

Effective Domain Knowledge Transfer with Soft Fine-tuning

no code implementations • 5 Sep 2019 • Zhichen Zhao, Bo-Wen Zhang, Yuning Jiang, Li Xu, Lei LI, Wei-Ying Ma

However, the datasets from source domain are simply discarded in the fine-tuning process.

Transfer Learning

Multi-Antenna Channel Interpolation via Tucker Decomposed Extreme Learning Machine

no code implementations • 26 Dec 2018 • Han Zhang, Bo Ai, Wenjun Xu, Li Xu, Shuguang Cui

Channel interpolation is an essential technique for providing high-accuracy estimation of the channel state information (CSI) for wireless systems design where the frequency-space structural correlations of multi-antenna channel are typically hidden in matrix or tensor forms.

Structure-Preserving Image Super-resolution via Contextualized Multi-task Learning

no code implementations • 26 Jul 2017 • Yukai Shi, Keze Wang, Chongyu Chen, Li Xu, Liang Lin

Single image super resolution (SR), which refers to reconstructing a higher-resolution (HR) image from the observed low-resolution (LR) image, has received substantial attention due to its tremendous application potential.

Computational Efficiency Image Restoration +2

Accurate Single Stage Detector Using Recurrent Rolling Convolution

2 code implementations • CVPR 2017 • Jimmy Ren, Xiaohao Chen, Jianbo Liu, Wenxiu Sun, Jiahao Pang, Qiong Yan, Yu-Wing Tai, Li Xu

In this paper, we propose a novel single stage end-to-end trainable object detection network to overcome this limitation.

3D Object Detection Object +2

362

Local- and Holistic- Structure Preserving Image Super Resolution via Deep Joint Component Learning

no code implementations • 25 Jul 2016 • Yukai Shi, Keze Wang, Li Xu, Liang Lin

Recently, machine learning based single image super resolution (SR) approaches focus on jointly learning representations for high-resolution (HR) and low-resolution (LR) image patch pairs to improve the quality of the super-resolved images.

Image Super-Resolution Representation Learning

Look, Listen and Learn - A Multimodal LSTM for Speaker Identification

no code implementations • 13 Feb 2016 • Jimmy Ren, Yongtao Hu, Yu-Wing Tai, Chuan Wang, Li Xu, Wenxiu Sun, Qiong Yan

This task not only requires collective perception over both visual and auditory signals; robustness to severe quality degradations and unconstrained content variations is also indispensable.

Speaker Identification

Shepard Convolutional Neural Networks

1 code implementation • NeurIPS 2015 • Jimmy SJ. Ren, Li Xu, Qiong Yan, Wenxiu Sun

In this paper, we draw on Shepard interpolation and design Shepard Convolutional Neural Networks (ShCNN), which efficiently realize end-to-end trainable TVI operators in the network.

Image Inpainting Super-Resolution +1

131

Mutual-Structure for Joint Filtering

no code implementations • ICCV 2015 • Xiaoyong Shen, Chao Zhou, Li Xu, Jiaya Jia

Previous joint/guided filters directly transfer the structural information in the reference image to the target one.

Depth Completion Image Enhancement +3

Deep Multimodal Speaker Naming

no code implementations • 17 Jul 2015 • Yongtao Hu, Jimmy Ren, Jingwen Dai, Chang Yuan, Li Xu, Wenping Wang

Automatic speaker naming is the problem of localizing as well as identifying each speaking character in a TV/movie/live show video.

Face Alignment

Just Noticeable Defocus Blur Detection and Estimation

no code implementations • CVPR 2015 • Jianping Shi, Li Xu, Jiaya Jia

We tackle a fundamental problem to detect and estimate just noticeable blur (JNB) caused by defocus that spans a small number of pixels in images.

Defocus Blur Detection

Handling Motion Blur in Multi-Frame Super-Resolution

no code implementations • CVPR 2015 • Ziyang Ma, Renjie Liao, Xin Tao, Li Xu, Jiaya Jia, Enhua Wu

Ubiquitous motion blur easily causes multi-frame super-resolution (MFSR) to fail.

Image Reconstruction Multi-Frame Super-Resolution

On Vectorization of Deep Convolutional Neural Networks for Vision Tasks

no code implementations • 29 Jan 2015 • Jimmy SJ. Ren, Li Xu

We recently have witnessed many ground-breaking results in machine learning and computer vision, generated by using deep convolutional neural networks (CNN).

Deep Convolutional Neural Network for Image Deconvolution

no code implementations • NeurIPS 2014 • Li Xu, Jimmy SJ. Ren, Ce Liu, Jiaya Jia

Many fundamental image-related problems involve deconvolution operators.

Ranked #1 on Image Compression on FER2013

Image Compression Image Deconvolution

Hierarchical Saliency Detection on Extended CSSD

no code implementations • 11 Aug 2014 • Jianping Shi, Qiong Yan, Li Xu, Jiaya Jia

Complex structures commonly exist in natural images.

Saliency Detection

An Evasion and Counter-Evasion Study in Malicious Websites Detection

no code implementations • 8 Aug 2014 • Li Xu, Zhenxin Zhan, Shouhuai Xu, Keyin Ye

Within this framework, we show that an adaptive attacker can make malicious websites evade powerful detection models, but proactive training can be an effective counter-evasion defense mechanism.

Discriminative Blur Detection Features

no code implementations • CVPR 2014 • Jianping Shi, Li Xu, Jiaya Jia

Ubiquitous image blur raises a practically important question: what are effective features to differentiate between blurred and unblurred image regions?

Deblurring

Joint Depth Estimation and Camera Shake Removal from Single Blurry Image

no code implementations • CVPR 2014 • Zhe Hu, Li Xu, Ming-Hsuan Yang

The non-uniform blur effect is caused not only by the camera motion but also by the depth variation of the scene.

Deblurring Depth Estimation +1

100+ Times Faster Weighted Median Filter (WMF)

no code implementations • CVPR 2014 • Qi Zhang, Li Xu, Jiaya Jia

Weighted median, in the form of either solver or filter, has been employed in a wide range of computer vision solutions for its beneficial properties in sparsity representation.

2D Semantic Segmentation task 1 (8 classes) Optical Flow Estimation +2

Dense Scattering Layer Removal

no code implementations • 13 Oct 2013 • Qiong Yan, Li Xu, Jiaya Jia

We propose a new model, together with advanced optimization, to separate a thick scattering media layer from a single natural image.

Unnatural L0 Sparse Representation for Natural Image Deblurring

no code implementations • CVPR 2013 • Li Xu, Shicheng Zheng, Jiaya Jia

We show in this paper that the success of previous maximum a posteriori (MAP) based blur removal methods partly stems from their respective intermediate steps, which implicitly or explicitly create an unnatural representation containing salient image structures.

Ranked #13 on Deblurring on RealBlur-R (trained on GoPro) (SSIM (sRGB) metric)

Deblurring Image Deblurring

Hierarchical Saliency Detection

no code implementations • CVPR 2013 • Qiong Yan, Li Xu, Jianping Shi, Jiaya Jia

When dealing with objects with complex structures, saliency detection confronts a critical problem, namely that detection accuracy could be adversely affected if the salient foreground or background in an image contains small-scale high-contrast patterns.

Saliency Detection
