no code implementations • CCL 2022 • Shuang Nie, Zheng Ye, Jun Qin, Jing Liu
“目前常见的机器阅读理解数据增强方法如回译, 单独对文章或者问题进行数据增强, 没有考虑文章、问题和选项三元组之间的联系。因此, 本文探索了一种利用三元组联系进行文章句子筛选的数据增强方法, 通过比较文章与问题以及选项的相似度, 选取文章中与二者联系紧密的句子。同时为了使不同选项的三元组区别增大, 我们选用了正则化Dropout的策略。实验结果表明, 在RACE数据集上的准确率可提高3. 8%。”
1 code implementation • Findings (ACL) 2022 • Le Qi, Shangwen Lv, Hongyu Li, Jing Liu, Yu Zhang, Qiaoqiao She, Hua Wu, Haifeng Wang, Ting Liu
Open-domain question answering has been used in a wide range of applications, such as web search and enterprise search, which usually takes clean texts extracted from various formats of documents (e. g., web pages, PDFs, or Word documents) as the information source.
no code implementations • ECCV 2020 • Yujun Cai, Lin Huang, Yiwei Wang, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Xu Yang, Yiheng Zhu, Xiaohui Shen, Ding Liu, Jing Liu, Nadia Magnenat Thalmann
Last, in order to incorporate a general motion space for high-quality prediction, we build a memory-based dictionary, which aims to preserve the global motion patterns in training data to guide the predictions.
1 code implementation • ECCV 2020 • Zheng Xie, Zhiquan Wen, Jing Liu, Zhi-Qiang Liu, Xixian Wu, Mingkui Tan
Specifically, we propose a method named deep transferring quantization (DTQ) to effectively exploit the knowledge in a pre-trained full-precision model.
no code implementations • 6 May 2024 • XiaoBin Li, Kai Wu, Yujian Betterest Li, XiaoYu Zhang, Handing Wang, Jing Liu
Pretrained Optimization Models (POMs) leverage knowledge gained from optimizing various tasks, providing efficient solutions for new optimization challenges through direct usage or fine-tuning.
no code implementations • 22 Apr 2024 • Dongze Hao, Qunbo Wang, Longteng Guo, Jie Jiang, Jing Liu
Knowledge-based Visual Question Answering (VQA) requires models to incorporate external knowledge to respond to questions about visual content.
no code implementations • 12 Apr 2024 • Yichen Yan, Xingjian He, Sihan Chen, Jing Liu
In this paper, we introduce CRFormer, a model that iteratively calibrates multi-modal features in the transformer decoder.
no code implementations • 12 Apr 2024 • Zhiwei Yang, Jing Liu, Peng Wu
Further, we propose a learnable text prompt mechanism with the assist of a normality visual prompt to further improve the matching accuracy of video event description text and video frames.
no code implementations • 3 Apr 2024 • Paiheng Xu, Jing Liu, Nathan Jones, Julie Cohen, Wei Ai
Assessing instruction quality is a fundamental component of any improvement efforts in the education system.
no code implementations • 27 Mar 2024 • Jiahao Luo, Jing Liu, James Davis
Our method is designed to simultaneously deliver both high-quality novel view rendering and accurate 3D mesh reconstructions.
no code implementations • 20 Mar 2024 • Yanyuan Qiao, Zheng Yu, Longteng Guo, Sihan Chen, Zijia Zhao, Mingzhen Sun, Qi Wu, Jing Liu
The extensive experiments on diverse multimodal benchmarks with competitive performance show the effectiveness of our proposed VL-Mamba and demonstrate the great potential of applying state space models for multimodal learning tasks.
Ranked #68 on Visual Question Answering on MM-Vet
1 code implementation • 20 Mar 2024 • Tongtian Yue, Jie Cheng, Longteng Guo, Xingyuan Dai, Zijia Zhao, Xingjian He, Gang Xiong, Yisheng Lv, Jing Liu
In this paper, we present and delve into the self-consistency capability of LVLMs, a crucial aspect that reflects the models' ability to both generate informative captions for specific objects and subsequently utilize these captions to accurately re-identify the objects in a closed-loop process.
no code implementations • 18 Mar 2024 • Xiangyu Chen, Jing Liu, Ye Wang, Pu, Wang, Matthew Brand, Guanghui Wang, Toshiaki Koike-Akino
Low-rank adaptation (LoRA) and its variants are widely employed in fine-tuning large models, including large language models for natural language processing and diffusion models for computer vision.
no code implementations • 15 Mar 2024 • Dongze Hao, Jian Jia, Longteng Guo, Qunbo Wang, Te Yang, Yan Li, Yanhua Cheng, Bo wang, Quan Chen, Han Li, Jing Liu
We condense the retrieved knowledge passages from two perspectives.
no code implementations • 7 Mar 2024 • Hui Huang, Yingqi Qu, Jing Liu, Muyun Yang, Tiejun Zhao
The proliferation of open-source Large Language Models (LLMs) underscores the pressing need for evaluation methods.
no code implementations • 6 Mar 2024 • Zewei Tian, Min Sun, Alex Liu, Shawon Sarkar, Jing Liu
This paper explores the transformative potential of computer-assisted textual analysis in enhancing instructional quality through in-depth insights from educational artifacts.
no code implementations • 5 Mar 2024 • Hui Huang, Yingqi Qu, Jing Liu, Muyun Yang, Tiejun Zhao
Recently, there has been a growing trend of utilizing Large Language Model (LLM) to evaluate the quality of other LLMs.
no code implementations • 28 Feb 2024 • Bin Cao, Jianhao Yuan, Yexin Liu, Jian Li, Shuyang Sun, Jing Liu, Bo Zhao
To alleviate artifacts and improve quality of synthetic images, we fine-tune Vision-Language Model (VLM) as artifact classifier to automatically identify and classify a wide range of artifacts and provide supervision for further optimizing generative models.
no code implementations • 27 Feb 2024 • Ruiyang Ren, Peng Qiu, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Hua Wu, Ji-Rong Wen, Haifeng Wang
Due to the excellent capacities of large language models (LLMs), it becomes feasible to develop LLM-based agents for reliable user simulation.
1 code implementation • 27 Feb 2024 • Yuhao Wang, Ruiyang Ren, Junyi Li, Wayne Xin Zhao, Jing Liu, Ji-Rong Wen
By combining the improvements in both architecture and training, our proposed REAR can better utilize external knowledge by effectively perceiving the relevance of retrieved documents.
no code implementations • 20 Feb 2024 • Jie Yan, Jing Liu, Yi-Zi Ning, Zhong-Yuan Zhang
In federated clustering, multiple data-holding clients collaboratively group data without exchanging raw data.
1 code implementation • 17 Feb 2024 • Wenxuan Wang, Yisi Zhang, Xingjian He, Yichen Yan, Zijia Zhao, Xinlong Wang, Jing Liu
Previous datasets and methods for classic VG task mainly rely on the prior assumption that the given expression must literally refer to the target object, which greatly impedes the practical deployment of agents in real-world scenarios.
no code implementations • 14 Feb 2024 • Andrew Lowy, Zhuohang Li, Jing Liu, Toshiaki Koike-Akino, Kieran Parsons, Ye Wang
In practical applications, such a worst-case guarantee may be overkill: practical attackers may lack exact knowledge of (nearly all of) the private data, and our data set might be easier to defend, in some sense, than the worst-case data set.
no code implementations • 19 Jan 2024 • Hongyi Wang, Xiuju Du, Jing Liu, Shuyi Ouyang, Yen-Wei Chen, Lanfen Lin
To address this limit, we propose M2ORT, a many-to-one regression Transformer that can accommodate the hierarchical structure of the pathology images through a decoupled multi-scale feature extractor.
no code implementations • 16 Jan 2024 • Yang Feng, Zhaohui Sun, Chengcheng Wang, Xinyi Guo, Junyao Mei, Yueran Qi, Jing Liu, Junyu Zhang, Jixuan Wu, Xuepeng Zhan, Jiezhi Chen
Flash memory has been widely adopted as stand-alone memory and embedded memory due to its robust reliability.
1 code implementation • 12 Jan 2024 • Jie Yan, Jing Liu, Zhong-Yuan Zhang
Benefiting from representation learning, the clustering performance of CCFC even double those of the best baseline methods in some cases.
no code implementations • 2 Jan 2024 • Hongyu Wang, Xiaotao Liu, YiFan Li, Meng Sun, Dian Yuan, Jing Liu
RGBT tracking has been widely used in various fields such as robotics, surveillance processing, and autonomous driving.
Ranked #4 on Rgb-T Tracking on RGBT210
1 code implementation • 18 Dec 2023 • Lanlan Chen, Kai Wu, Jian Lou, Jing Liu
Modeling continuous-time dynamics constitutes a foundational challenge, and uncovering inter-component correlations within complex systems holds promise for enhancing the efficacy of dynamic modeling.
1 code implementation • 13 Dec 2023 • Wenxuan Wang, Tongtian Yue, Yisi Zhang, Longteng Guo, Xingjian He, Xinlong Wang, Jing Liu
To foster future research into fine-grained visual grounding, our benchmark RefCOCOm, the MRES-32M dataset and model UniRES will be publicly available at https://github. com/Rubics-Xuan/MRES
no code implementations • 13 Dec 2023 • Jie Yan, Jing Liu, Zhong-Yuan Zhang
In the E-step, we aim to derive a mixture of Gaussian priors for the subsequent M-step.
1 code implementation • 29 Nov 2023 • Haoyu He, Zizheng Pan, Jing Liu, Jianfei Cai, Bohan Zhuang
In this work, we present a novel framework, Efficient Stitchable Task Adaptation (ESTA), to efficiently produce a palette of fine-tuned models that adhere to diverse resource constraints.
1 code implementation • 29 Nov 2023 • Zijian Chen, Wei Sun, Jun Jia, Fangfang Lu, ZiCheng Zhang, Jing Liu, Ru Huang, Xiongkuo Min, Guangtao Zhai
The quality score of a banding image is generated by pooling the banding detection maps masked by the spatial frequency filters.
1 code implementation • 27 Nov 2023 • Yushi Huang, Ruihao Gong, Jing Liu, Tianlong Chen, Xianglong Liu
Remarkably, our quantization approach, for the first time, achieves model performance nearly on par with the full-precision model under 4-bit weight quantization.
no code implementations • 13 Nov 2023 • Peng Wu, Xuerong Zhou, Guansong Pang, Yujia Sun, Jing Liu, Peng Wang, Yanning Zhang
Particularly, we devise a semantic knowledge injection module to introduce semantic knowledge from large language models for the detection task, and design a novel anomaly synthesis module to generate pseudo unseen anomaly videos with the help of large vision generation models for the classification task.
no code implementations • 3 Nov 2023 • James Boyko, Joseph Cohen, Nathan Fox, Maria Han Veiga, Jennifer I-Hsiu Li, Jing Liu, Bernardo Modenesi, Andreas H. Rauch, Kenneth N. Reid, Soumi Tribedi, Anastasia Visheratina, Xin Xie
In this paper, we describe the capabilities and constraints of Large Language Models (LLMs) within disparate academic disciplines, aiming to delineate their strengths and limitations with precision.
no code implementations • 28 Oct 2023 • Haoran Shen, Yifu Zhang, Wenxuan Wang, Chen Chen, Jing Liu, Shanshan Song, Jiangyun Li
As a pioneering work, a dynamic architecture network for medical volumetric segmentation (i. e. Med-DANet) has achieved a favorable accuracy and efficiency trade-off by dynamically selecting a suitable 2D candidate model from the pre-defined model bank for different slices.
2 code implementations • 12 Oct 2023 • Jing Liu, Ruihao Gong, Xiuying Wei, Zhiwei Dong, Jianfei Cai, Bohan Zhuang
Additionally, an adaptive strategy is designed to autonomously determine the optimal number of sub-channels for channel disassembly.
no code implementations • 12 Oct 2023 • Niklas Smedemark-Margulies, Ye Wang, Toshiaki Koike-Akino, Jing Liu, Kieran Parsons, Yunus Bicer, Deniz Erdogmus
Classification models for electroencephalogram (EEG) data show a large decrease in performance when evaluated on unseen test sub jects.
1 code implementation • 5 Oct 2023 • Yefei He, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang
In this paper, we introduce a data-free and parameter-efficient fine-tuning framework for low-bit diffusion models, dubbed EfficientDM, to achieve QAT-level performance with PTQ-like efficiency.
1 code implementation • NeurIPS 2023 • Mingzhen Sun, Weining Wang, Zihan Qin, Jiahui Sun, Sihan Chen, Jing Liu
Specifically, we propose a video auto-encoder, where a video encoder encodes videos into global features, and a video decoder, built on a diffusion model, decodes the global features and synthesizes video frames in a non-autoregressive manner.
no code implementations • 12 Sep 2023 • Ahmed Adel Attia, Jing Liu, Wei Ai, Dorottya Demszky, Carol Espy-Wilson
Recent advancements in Automatic Speech Recognition (ASR) systems, exemplified by Whisper, have demonstrated the potential of these systems to approach human-level performance given sufficient data.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • 11 Sep 2023 • Li Chen, Mengyi Zhao, Yiheng Liu, Mingxu Ding, Yangyang Song, Shizun Wang, Xu Wang, Hao Yang, Jing Liu, Kang Du, Min Zheng
Personalized text-to-image generation has emerged as a powerful and sought-after tool, empowering users to create customized images based on their specific concepts and prompts.
no code implementations • 8 Sep 2023 • Yanrui Du, Sendong Zhao, Yuhan Chen, Rai Bai, Jing Liu, Hua Wu, Haifeng Wang, Bing Qin
To address this issue, it is crucial to analyze and mitigate the influence of superficial clues on STM models.
1 code implementation • 5 Sep 2023 • Kai Wu, Yuanyuan Li, Jing Liu
Inferring networks from observed time series data presents a clear glimpse into the interconnections among nodes.
1 code implementation • 23 Aug 2023 • Yufeng Yin, Di Chang, Guoxian Song, Shen Sang, Tiancheng Zhi, Jing Liu, Linjie Luo, Mohammad Soleymani
The proposed FG-Net achieves a strong generalization ability for heatmap-based AU detection thanks to the generalizable and semantic-rich features extracted from the pre-trained generative model.
1 code implementation • ICCV 2023 • Yanyuan Qiao, Yuankai Qi, Zheng Yu, Jing Liu, Qi Wu
Nevertheless, this poses more challenges than other VLN tasks since it requires agents to infer a navigation plan only based on a short instruction.
no code implementations • 18 Aug 2023 • Yichen Yan, Xingjian He, Wenxuan Wang, Sihan Chen, Jing Liu
In previous approaches, fused vision-language features are directly fed into a decoder and pass through a convolution with a fixed kernel to obtain the result, which follows a similar pattern as traditional image segmentation.
1 code implementation • ICCV 2023 • Dingkang Yang, Shuai Huang, Zhi Xu, Zhenpeng Li, Shunli Wang, Mingcheng Li, Yuzheng Wang, Yang Liu, Kun Yang, Zhaoyu Chen, Yan Wang, Jing Liu, Peixuan Zhang, Peng Zhai, Lihua Zhang
Driver distraction has become a significant cause of severe traffic accidents over the past decade.
1 code implementation • ICCV 2023 • Kun Yang, Dingkang Yang, Jingyu Zhang, Mingcheng Li, Yang Liu, Jing Liu, Hanqi Wang, Peng Sun, Liang Song
In this paper, we propose SCOPE, a novel collaborative perception framework that aggregates the spatio-temporal awareness characteristics across on-road agents in an end-to-end manner.
1 code implementation • 24 Jul 2023 • Peng Wu, Jing Liu, Xiangteng He, Yuxin Peng, Peng Wang, Yanning Zhang
In this context, we propose a novel task called Video Anomaly Retrieval (VAR), which aims to pragmatically retrieve relevant anomalous videos by cross-modalities, e. g., language descriptions and synchronous audios.
1 code implementation • 20 Jul 2023 • Xilei Zhu, Huiyu Duan, Yuqin Cao, Yuxin Zhu, Yucheng Zhu, Jing Liu, Li Chen, Xiongkuo Min, Guangtao Zhai
Omnidirectional videos (ODVs) play an increasingly important role in the application fields of medical, education, advertising, tourism, etc.
1 code implementation • 20 Jul 2023 • Ruiyang Ren, Yuhao Wang, Yingqi Qu, Wayne Xin Zhao, Jing Liu, Hao Tian, Hua Wu, Ji-Rong Wen, Haifeng Wang
In this study, we present an initial analysis of the factual knowledge boundaries of LLMs and how retrieval augmentation affects LLMs on open-domain QA.
1 code implementation • 1 Jul 2023 • Jiarui Wang, Huiyu Duan, Jing Liu, Shi Chen, Xiongkuo Min, Guangtao Zhai
In this paper, in order to get a better understanding of the human visual preferences for AIGIs, a large-scale IQA database for AIGC is established, which is named as AIGCIQA2023.
1 code implementation • 30 Jun 2023 • Zizheng Pan, Jing Liu, Haoyu He, Jianfei Cai, Bohan Zhuang
With extensive experiments on ImageNet-1K, ADE20K, COCO-Stuff-10K and NYUv2, SN-Netv2 demonstrates superior performance over SN-Netv1 on downstream dense predictions and shows strong ability as a flexible vision backbone, achieving great advantages in both training efficiency and deployment flexibility.
1 code implementation • 15 Jun 2023 • Kun Zhang, Le Wu, Guangyi Lv, Enhong Chen, Shulan Ruan, Jing Liu, Zhiqiang Zhang, Jun Zhou, Meng Wang
Then, we propose a novel Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as optimization targets.
1 code implementation • 15 Jun 2023 • Sihan Chen, Xingjian He, Handong Li, Xiaojie Jin, Jiashi Feng, Jing Liu
Due to the limited scale and quality of video-text training corpus, most vision-language foundation models employ image-text datasets for pretraining and primarily focus on modeling visually semantic representations while disregarding temporal semantic representations and correlations.
Ranked #1 on TGIF-Frame on TGIF-QA (using extra training data)
1 code implementation • NeurIPS 2023 • Sihan Chen, Handong Li, Qunbo Wang, Zijia Zhao, Mingzhen Sun, Xinxin Zhu, Jing Liu
Based on the proposed VAST-27M dataset, we train an omni-modality video-text foundational model named VAST, which can perceive and process vision, audio, and subtitle modalities from video, and better support various tasks including vision-text, audio-text, and multi-modal video-text tasks (retrieval, captioning and QA).
Ranked #1 on Image Captioning on COCO Captions (SPICE metric, using extra training data)
no code implementations • 27 May 2023 • Kai Wu, Yujian Betterest Li, Jian Lou, XiaoYu Zhang, Handing Wang, Jing Liu
It is frightening that deep neural networks are vulnerable and sensitive to adversarial attacks, the most common one of which for the services is evasion-based.
1 code implementation • 25 May 2023 • Zijia Zhao, Longteng Guo, Tongtian Yue, Sihan Chen, Shuai Shao, Xinxin Zhu, Zehuan Yuan, Jing Liu
We show that only language-paired two-modality data is sufficient to connect all modalities.
no code implementations • 24 May 2023 • Yichen Yan, Xingjian He, Wenxuan Wan, Jing Liu
However, this task is challenging due to the distinct data properties between text and image, and the randomness introduced by diverse objects and unrestricted language expression.
no code implementations • 22 May 2023 • Xingjian He, Sihan Chen, Fan Ma, Zhicheng Huang, Xiaojie Jin, Zikang Liu, Dongmei Fu, Yi Yang, Jing Liu, Jiashi Feng
Towards this goal, we propose a novel video-text pre-training method dubbed VLAB: Video Language pre-training by feature Adapting and Blending, which transfers CLIP representations to video pre-training tasks and develops unified video multimodal models for a wide range of video-text tasks.
Ranked #1 on Visual Question Answering (VQA) on MSVD-QA (using extra training data)
no code implementations • 19 May 2023 • Wenxuan Wang, Jing Liu, Xingjian He, Yisi Zhang, Chen Chen, Jiachen Shen, Yan Zhang, Jiangyun Li
Referring image segmentation (RIS) is a fundamental vision-language task that intends to segment a desired object from an image based on a given natural language expression.
1 code implementation • 19 May 2023 • Zikang Liu, Sihan Chen, Longteng Guo, Handong Li, Xingjian He, Jing Liu
In this paper, we propose a novel method called Joint QA and DC GEneration (JADE), which utilizes a pre-trained multimodal model and easily-crawled image-text pairs to automatically generate and filter large-scale VQA and dense captioning datasets.
no code implementations • 18 May 2023 • Ruiyang Ren, Wayne Xin Zhao, Jing Liu, Hua Wu, Ji-Rong Wen, Haifeng Wang
Recently, model-based retrieval has emerged as a new paradigm in text retrieval that discards the index in the traditional retrieval model and instead memorizes the candidate corpora using model parameters.
no code implementations • 12 May 2023 • Kai Cheng, Xinhua Zeng, Yang Liu, Tian Wang, Chengxin Pang, Jing Teng, Zhaoyang Xia, Jing Liu
Since the anomaly set is complicated and unbounded, our STHA can adjust its detection ability to adapt to the human detection demands and the complexity degree of anomaly that happened in the history of a scene.
no code implementations • 9 May 2023 • Xuandi Fu, Kanthashree Mysore Sathyendra, Ankur Gandhe, Jing Liu, Grant P. Strimel, Ross McGowan, Athanasios Mouchtaris
Prior approaches typically relied on subword encoders for encoding the bias phrases.
no code implementations • 30 Apr 2023 • Minghui Yang, Jing Liu, Zhiwei Yang, Zhaoyang Wu
Focusing on more effective and comprehensive anomaly detection, we propose a network based on self-supervised learning and self-attentive graph convolution (SLSG) for anomaly detection.
Ranked #3 on Anomaly Detection on MVTec LOCO AD
no code implementations • 24 Apr 2023 • XiaoBin Li, Kai Wu, XiaoYu Zhang, Handing Wang, Jing Liu
To achieve this, 1) drawing on the mechanism of genetic algorithm, we propose a deep neural network framework called B2Opt, which has a stronger representation of optimization strategies based on survival of the fittest; 2) B2Opt can utilize the cheap surrogate functions of the target task to guide the design of the efficient optimization strategies.
no code implementations • 21 Apr 2023 • Wenxuan Wang, Jiachen Shen, Chen Chen, Jianbo Jiao, Jing Liu, Yan Zhang, Shanshan Song, Jiangyun Li
In this paper, we present the study on parameter-efficient transfer learning for medical volumetric segmentation and propose a new framework named Med-Tuning based on intra-stage feature enhancement and inter-stage feature interaction.
no code implementations • 19 Apr 2023 • Kai Wu, Penghui Liu, Jing Liu
Evolutionary algorithms (EAs) have emerged as a powerful framework for optimization, especially for black-box optimization.
1 code implementation • 17 Apr 2023 • Sihan Chen, Xingjian He, Longteng Guo, Xinxin Zhu, Weining Wang, Jinhui Tang, Jing Liu
Different from widely-studied vision-language pretraining models, VALOR jointly models relationships of vision, audio and language in an end-to-end manner.
Ranked #1 on Video Captioning on VATEX (using extra training data)
no code implementations • CVPR 2023 • Zhiwei Yang, Jing Liu, Zhaoyang Wu, Peng Wu, Xiaotao Liu
Video anomaly detection (VAD) is a significant computer vision problem.
no code implementations • 5 Apr 2023 • Donglai Wei, Sipeng Zhang, Tong Yang, Yang Liu, Jing Liu
On the other hand, the Masking Caption Modeling (MCM) loss leverages a masked captions prediction task to establish detailed and generic relationships between textual and visual parts.
no code implementations • 3 Apr 2023 • Saumya Y. Sahai, Jing Liu, Thejaswi Muniyappa, Kanthashree M. Sathyendra, Anastasios Alexandridis, Grant P. Strimel, Ross McGowan, Ariya Rastrow, Feng-Ju Chang, Athanasios Mouchtaris, Siegfried Kunzmann
We present dual-attention neural biasing, an architecture designed to boost Wake Words (WW) recognition and improve inference time latency on speech recognition tasks.
no code implementations • 30 Mar 2023 • Rahul Pandey, Roger Ren, Qi Luo, Jing Liu, Ariya Rastrow, Ankur Gandhe, Denis Filimonov, Grant Strimel, Andreas Stolcke, Ivan Bulyko
End-to-End (E2E) automatic speech recognition (ASR) systems used in voice assistants often have difficulties recognizing infrequent words personalized to the user, such as names and places.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • 29 Mar 2023 • Jiawei Liu, Weining Wang, Sihan Chen, Xinxin Zhu, Jing Liu
In this work, we concentrate on a rarely investigated problem of text guided sounding video generation and propose the Sounding Video Generator (SVG), a unified framework for generating realistic videos along with audio signals.
no code implementations • CVPR 2023 • Hongyi Xu, Guoxian Song, Zihang Jiang, Jianfeng Zhang, Yichun Shi, Jing Liu, WanChun Ma, Jiashi Feng, Linjie Luo
We present OmniAvatar, a novel geometry-guided 3D head synthesis model trained from in-the-wild unstructured images that is capable of synthesizing diverse identity-preserved 3D heads with compelling dynamic details under full disentangled control over camera poses, facial expressions, head shapes, articulated neck and jaw poses.
no code implementations • 24 Mar 2023 • Guoxian Song, Hongyi Xu, Jing Liu, Tiancheng Zhi, Yichun Shi, Jianfeng Zhang, Zihang Jiang, Jiashi Feng, Shen Sang, Linjie Luo
Capitalizing on the recent advancement of 3D-aware GAN models, we perform \emph{guided transfer learning} on a pretrained 3D GAN generator to produce multi-view-consistent stylized renderings.
1 code implementation • CVPR 2023 • Zhaodi Zhang, Zhiyi Xue, Yang Chen, Si Liu, Yueling Zhang, Jing Liu, Min Zhang
Via abstraction, all perturbed images are mapped into intervals before feeding into neural networks for training.
1 code implementation • 14 Mar 2023 • ZiCheng Zhang, Wei Sun, Yingjie Zhou, Jun Jia, Zhichao Zhang, Jing Liu, Xiongkuo Min, Guangtao Zhai
Computer graphics images (CGIs) are artificially generated by means of computer programs and are widely perceived under various scenarios, such as games, streaming media, etc.
2 code implementations • CVPR 2023 • Mingzhen Sun, Weining Wang, Xinxin Zhu, Jing Liu
Experimental results demonstrate that our method achieves new state-of-the-art performance on five challenging benchmarks for video prediction and unconditional video generation: BAIR, RoboNet, KTH, KITTI and UCF101.
no code implementations • 28 Feb 2023 • Yanchen Liu, Jing Yan, Yan Chen, Jing Liu, Hua Wu
Recent studies reveal that various biases exist in different NLP tasks, and over-reliance on biases results in models' poor generalization ability and low adversarial robustness.
1 code implementation • 27 Feb 2023 • Jing Liu, Tongya Zheng, Guanzheng Zhang, Qinfen Hao
It then provides a comprehensive summary of three types of Graph-based Knowledge Distillation methods, namely Graph-based Knowledge Distillation for deep neural networks (DKD), Graph-based Knowledge Distillation for GNNs (GKD), and Self-Knowledge Distillation based Graph-based Knowledge Distillation (SKD).
no code implementations • 23 Feb 2023 • Kun Yang, Jing Liu, Dingkang Yang, Hanqi Wang, Peng Sun, Yanni Zhang, Yan Liu, Liang Song
With the rapid development of intelligent transportation system applications, a tremendous amount of multi-view video data has emerged to enhance vehicle perception.
no code implementations • 14 Feb 2023 • Minghao Liu, Zeyu Cheng, Shen Sang, Jing Liu, James Davis
Compared to direct annotation of labels, the proposed method: produces higher annotator agreements, causes machine learning to generates more consistent predictions, and only requires a marginal cost to add new rendering systems.
1 code implementation • 10 Feb 2023 • Yang Liu, Dingkang Yang, Yan Wang, Jing Liu, Jun Liu, Azzedine Boukerche, Peng Sun, Liang Song
Video Anomaly Detection (VAD) serves as a pivotal technology in the intelligent surveillance systems, enabling the temporal or spatial identification of anomalous events within videos.
no code implementations • 2 Feb 2023 • Bohan Zhuang, Jing Liu, Zizheng Pan, Haoyu He, Yuetian Weng, Chunhua Shen
Recent advances in Transformers have come with a huge requirement on computing resources, highlighting the importance of developing efficient training techniques to make Transformer training faster, at lower cost, and to higher accuracy by the efficient use of computation and memory resources.
no code implementations • 25 Jan 2023 • Jing Liu, Hemant Singh, Saber Elsayed, Robert Hunjet, Hussein Abbass
Robotic shepherding is a bio-inspired approach to autonomously guiding a swarm of agents towards a desired location.
no code implementations • 19 Jan 2023 • Yuanyuan Li, Kai Wu, Jing Liu
Our proposal is competitive in identifying the change points and discovering governing differential equations in three hybrid systems and two switching linear systems.
1 code implementation • journal 2023 • Yepeng Tang, Weining Wang, Yanwu Yang, Chunjie Zhang, Jing Liu
The TCM aggregates the temporal context information and provides features for the IBM and the FPBM.
no code implementations • ICCV 2023 • Dan Liu, Jin Hou, Shaoli Huang, Jing Liu, Yuxin He, Bochuan Zheng, Jifeng Ning, Jingdong Zhang
To break the deadlock, we present LoTE-Animal, a large-scale endangered animal dataset collected over 12 years, to foster the application of deep learning in rare species conservation.
no code implementations • ICCV 2023 • Yefei He, Zhenyu Lou, Luoming Zhang, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang
To solve this, we propose Softmax-aware Binarization, which dynamically adapts to the data distribution and reduces the error caused by binarization.
no code implementations • 16 Dec 2022 • Yeqing Lin, Feng Shu, Rongen Dong, Riqing Chen, Siling Feng, Weiping Shi, Jing Liu, Jiangzhou Wang
In this paper, in order to boost the achievable rate of user in such a wireless network, three enhanced-rate iterative beamforming methods are proposed by designing the amplifying factors and the corresponding phases at active IRS.
no code implementations • 5 Dec 2022 • Feng Shu, Jing Liu, Yeqing Lin, Yang Liu, Zhilin Chen, Xuehui Wang, Rongen Dong, Jiangzhou Wang
To fully exploit the amplifying gain achieved by active IRS, two high-rate methods, maximum ratio reflecting (MRR) and selective ratio reflecting (SRR) are presented, which are motivated by maximum ratio combining and selective ratio combining.
no code implementations • 30 Nov 2022 • Jie Yan, Jing Liu, Ji Qi, Zhong-Yuan Zhang
Federated clustering (FC) is an essential extension of centralized clustering designed for the federated setting, wherein the challenge lies in constructing a global similarity measure without the need to share private data.
no code implementations • 28 Nov 2022 • Huixin Ma, Kai Wu, Handing Wang, Jing Liu
In this way, our proposal can better keep the advantages of previous community detection results and transfer them to the next task.
2 code implementations • 27 Nov 2022 • Wayne Xin Zhao, Jing Liu, Ruiyang Ren, Ji-Rong Wen
With powerful PLMs, we can effectively learn the representations of queries and texts in the latent representation space, and further construct the semantic matching function between the dense vectors for relevance modeling.
no code implementations • 15 Nov 2022 • Shen Sang, Tiancheng Zhi, Guoxian Song, Minghao Liu, Chunpong Lai, Jing Liu, Xiang Wen, James Davis, Linjie Luo
We propose a novel self-supervised learning framework to create high-quality stylized 3D avatars with a mix of continuous and discrete parameters.
no code implementations • 14 Nov 2022 • Yefei He, Zhenyu Lou, Luoming Zhang, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang
To solve this, we propose Softmax-aware Binarization, which dynamically adapts to the data distribution and reduces the error caused by binarization.
1 code implementation • 14 Nov 2022 • Mengyang Zhao, Xinhua Zeng, Yang Liu, Jing Liu, Di Li, Xing Hu, Chengxin Pang
Existing unsupervised VAD methods tend to learn normality from training sets consisting of only normal videos and regard instances deviating from such normality as anomalies.
2 code implementations • 7 Nov 2022 • Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, Jingang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, Yoonsik Choe, Jinwoo Jeong, Sungjei Kim, Maciej Smyl, Tomasz Latkowski, Pawel Kubik, Michal Sokolski, Yujie Ma, Jiahao Chao, Zhou Zhou, Hongfan Gao, Zhengfeng Yang, Zhenbing Zeng, Zhengyang Zhuge, Chenghua Li, Dan Zhu, Mengdi Sun, Ran Duan, Yan Gao, Lingshun Kong, Long Sun, Xiang Li, Xingdong Zhang, Jiawei Zhang, Yaqi Wu, Jinshan Pan, Gaocheng Yu, Jin Zhang, Feng Zhang, Zhe Ma, Hongbin Wang, Hojin Cho, Steve Kim, Huaen Li, Yanbo Ma, Ziwei Luo, Youwei Li, Lei Yu, Zhihong Wen, Qi Wu, Haoqiang Fan, Shuaicheng Liu, Lize Zhang, Zhikai Zong, Jeremy Kwon, Junxi Zhang, Mengyuan Li, Nianxiang Fu, Guanchen Ding, Han Zhu, Zhenzhong Chen, Gen Li, Yuanfan Zhang, Lei Sun, Dafeng Zhang, Neo Yang, Fitz Liu, Jerry Zhao, Mustafa Ayazoglu, Bahri Batuhan Bilecen, Shota Hirose, Kasidis Arunruangsirilert, Luo Ao, Ho Chun Leung, Andrew Wei, Jie Liu, Qiang Liu, Dahai Yu, Ao Li, Lei Luo, Ce Zhu, Seongmin Hong, Dongwon Park, Joonhee Lee, Byeong Hyun Lee, Seunggyu Lee, Se Young Chun, Ruiyuan He, Xuhao Jiang, Haihang Ruan, Xinjian Zhang, Jing Liu, Garas Gendy, Nabil Sabor, Jingchao Hou, Guanghui He
While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints.
1 code implementation • 29 Oct 2022 • Jie Yan, Jing Liu, Ji Qi, Zhong-Yuan Zhang
Federated clustering (FC) is an extension of centralized clustering in federated settings.
no code implementations • 9 Oct 2022 • Zijia Zhao, Longteng Guo, Xingjian He, Shuai Shao, Zehuan Yuan, Jing Liu
Our method performs joint masking on image-text input and integrates both implicit and explicit targets for the masked signals to recover.
1 code implementation • 19 Sep 2022 • Jing Liu, Zizheng Pan, Haoyu He, Jianfei Cai, Bohan Zhuang
To this end, we propose a new binarization paradigm customized to high-dimensional softmax attention via kernelized hashing, called EcoFormer, to map the original queries and keys into low-dimensional binary codes in Hamming space.
no code implementations • 23 Aug 2022 • Jing Liu, Jianfei Cai, Bohan Zhuang
During architecture search, these methods focus on finding architectures on the Pareto frontier of performance and resource consumption, which forms a gap between training and deployment.
no code implementations • 21 Aug 2022 • Zhaodi Zhang, Yiting Wu, Si Liu, Jing Liu, Min Zhang
Considerable efforts have been devoted to finding the so-called tighter approximations to obtain more precise verification results.
no code implementations • 28 Jul 2022 • Yaozong Shen, Lijie Wang, Ying Chen, Xinyan Xiao, Jing Liu, Hua Wu
To fill in the gap, we propose a novel evaluation benchmark providing with both English and Chinese annotated data.
no code implementations • 27 Jul 2022 • Yang Liu, Jing Liu, Mengyang Zhao, Dingkang Yang, Xiaoguang Zhu, Liang Song
Video anomaly detection is a challenging task in the computer vision community.
no code implementations • 25 Jul 2022 • Jing Liu, Tongya Zheng, Qinfen Hao
To the best of our knowledge, we are the first to propose a HIgh-order RElational (HIRE) knowledge distillation framework on heterogeneous graphs, which can significantly boost the prediction performance regardless of model architectures of HGNNs.
1 code implementation • 22 Jul 2022 • Zhiwei Yang, Peng Wu, Jing Liu, Xiaotao Liu
Existing methods for anomaly detection based on memory-augmented autoencoder (AE) have the following drawbacks: (1) Establishing a memory bank requires additional memory space.
1 code implementation • NAACL (BEA) 2022 • Sterling Alic, Dorottya Demszky, Zid Mancenido, Jing Liu, Heather Hill, Dan Jurafsky
Responsive teaching is a highly effective strategy that promotes student learning.
no code implementations • 28 Jun 2022 • David R. Johnson, Nathan B. Geldner, Jing Liu, Uris Lantz Baldos, Thomas Hertel
Biobased energy, particularly corn starch-based ethanol and other liquid renewable fuels, are a major element of federal and state energy policies in the United States.
no code implementations • 15 Jun 2022 • Jing Liu, Laura Bowling, Christopher Kucharik, Sadia Jame, Uris Baldos, Larissa Jarvis, Navin Ramankutty, Thomas Hertel
Reducing the size of the hypoxic zone in the Gulf of Mexico has proven to be a challenging task.
no code implementations • 26 May 2022 • Kanthashree Mysore Sathyendra, Thejaswi Muniyappa, Feng-Ju Chang, Jing Liu, Jinru Su, Grant P. Strimel, Athanasios Mouchtaris, Siegfried Kunzmann
Personal rare word recognition in end-to-end Automatic Speech Recognition (E2E ASR) models is a challenge due to the lack of training data.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • 25 May 2022 • Yanrui Du, Jing Yan, Yan Chen, Jing Liu, Sendong Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang, Bing Qin
In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution of training data.
2 code implementations • 11 May 2022 • Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu1, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gaoand Dengwen Zhouand Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang
The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29. 00dB on DIV2K validation set.
4 code implementations • 2 May 2022 • Minghui Yang, Peng Wu, Jing Liu, Hui Feng
By comparing the similarities and differences between input samples and memory samples in the memory pool to give effective guesses about abnormal regions; In the inference phase, MemSeg directly determines the abnormal regions of the input image in an end-to-end manner.
Ranked #14 on Anomaly Detection on MVTec AD
no code implementations • 27 Apr 2022 • Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qifei Wu, Yuchen Ding, Hua Wu, Haifeng Wang, Ji-Rong Wen
Recent years have witnessed the significant advance in dense retrieval (DR) based on powerful pre-trained language models (PLM).
1 code implementation • 7 Apr 2022 • Chao Wang, Jiaxuan Zhao, Lingling Li, Licheng Jiao, Jing Liu, Kai Wu
Influence maximization is a crucial issue for mining the deep information of social networks, which aims to select a seed set from the network to maximize the number of influenced nodes.
2 code implementations • CVPR 2023 • Haoyu He, Jianfei Cai, Zizheng Pan, Jing Liu, Jing Zhang, DaCheng Tao, Bohan Zhuang
In this paper, we propose a simple yet effective query design for semantic segmentation termed Dynamic Focus-aware Positional Queries (DFPQ), which dynamically generates positional queries conditioned on the cross-attention scores from the preceding decoder block and the positional encodings for the corresponding image features, simultaneously.
Ranked #21 on Semantic Segmentation on ADE20K
no code implementations • 1 Apr 2022 • Xuandi Fu, Feng-Ju Chang, Martin Radfar, Kai Wei, Jing Liu, Grant P. Strimel, Kanthashree Mysore Sathyendra
In addition, the NLU model in the two-stage system is not streamable, as it must wait for the audio segments to complete processing, which ultimately impacts the latency of the SLU system.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +4
2 code implementations • 19 Mar 2022 • Yifu Qiu, Hongyu Li, Yingqi Qu, Ying Chen, Qiaoqiao She, Jing Liu, Hua Wu, Haifeng Wang
In this paper, we present DuReader_retrieval, a large-scale Chinese dataset for passage retrieval.
1 code implementation • 4 Jan 2022 • Chao Wang, Kai Wu, Jing Liu
Inspired by the characteristic of pairwise learning, the cheap AUC optimization task with a small-scale dataset sampled from the large-scale dataset is constructed to promote the AUC accuracy of the original, large-scale, and expensive AUC optimization task.
1 code implementation • 4 Jan 2022 • Kai Wu, Chao Wang, Junyuan Chen, Jing Liu
Community detection (CD) from dynamics and network reconstruction (NR) from dynamics are natural synergistic tasks that motivate the proposed evolutionary multitasking NR and CD framework, called network collaborator (NC).
1 code implementation • 16 Dec 2021 • Hongyu Zhu, Yan Chen, Jing Yan, Jing Liu, Yu Hong, Ying Chen, Hua Wu, Haifeng Wang
For this purpose, we create a Chinese dataset namely DuQM which contains natural questions with linguistic perturbations to evaluate the robustness of question matching models.
no code implementations • 13 Dec 2021 • Kai Wei, Thanh Tran, Feng-Ju Chang, Kanthashree Mysore Sathyendra, Thejaswi Muniyappa, Jing Liu, Anirudh Raju, Ross McGowan, Nathan Susanj, Ariya Rastrow, Grant P. Strimel
Recent years have seen significant advances in end-to-end (E2E) spoken language understanding (SLU) systems, which directly predict intents and slots from spoken audio.
Natural Language Understanding Spoken Language Understanding
4 code implementations • 24 Nov 2021 • Jing Liu, Jianfei Cai, Bohan Zhuang
However, the abrupt changes in quantized weights during training often lead to severe loss fluctuations and result in a sharp loss landscape, making the gradients unstable and thus degrading the performance.
Ranked #196 on Image Classification on CIFAR-100
3 code implementations • 23 Nov 2021 • Haoyu He, Jianfei Cai, Jing Liu, Zizheng Pan, Jing Zhang, DaCheng Tao, Bohan Zhuang
Relying on the single-path space, we introduce learnable binary gates to encode the operation choices in MSA layers.
Ranked #18 on Efficient ViTs on ImageNet-1K (with DeiT-T)
3 code implementations • 22 Nov 2021 • Zizheng Pan, Peng Chen, Haoyu He, Jing Liu, Jianfei Cai, Bohan Zhuang
While Transformers have delivered significant performance improvements, training such networks is extremely memory intensive owing to storing all intermediate activations that are needed for gradient computation during backpropagation, especially for long sequences.
no code implementations • 5 Nov 2021 • Feng-Ju Chang, Jing Liu, Martin Radfar, Athanasios Mouchtaris, Maurizio Omologo, Ariya Rastrow, Siegfried Kunzmann
We also leverage both BLSTM and pretrained BERT based models to encode contextual data and guide the network training.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • EMNLP 2021 • Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang, Ji-Rong Wen
In this paper, we propose a novel joint training approach for dense passage retrieval and passage re-ranking.
no code implementations • 29 Sep 2021 • Jing Liu, Chulin Xie, Krishnaram Kenthapadi, Oluwasanmi O Koyejo, Bo Li
Vertical Federated Learning (VFL) is a distributed learning paradigm that allows multiple agents to jointly train a global model when each agent holds a different subset of features for the same sample(s).
no code implementations • 6 Sep 2021 • Xingjian He, Weining Wang, Zhiyong Xu, Hao Wang, Jie Jiang, Jing Liu
Compared with image scene parsing, video scene parsing introduces temporal information, which can effectively improve the consistency and accuracy of prediction.
1 code implementation • Findings (ACL) 2021 • Ruiyang Ren, Shangwen Lv, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang, Ji-Rong Wen
Recently, dense passage retrieval has become a mainstream approach to finding relevant information in various natural language processing tasks.
no code implementations • 3 Aug 2021 • Jing Liu, Bohan Zhuang, Mingkui Tan, Xu Liu, Dinh Phung, Yuanqing Li, Jianfei Cai
More critically, EAS is able to find compact architectures within 0. 1 second for 50 deployment scenarios.
1 code implementation • ACL 2021 • Hongxuan Tang, Hongyu Li, Jing Liu, Yu Hong, Hua Wu, Haifeng Wang
Machine reading comprehension (MRC) is a crucial task in natural language processing and has achieved remarkable advancements.
no code implementations • 27 Jul 2021 • Luyu Qiu, Yi Yang, Caleb Chen Cao, Jing Liu, Yueyuan Zheng, Hilary Hei Ting Ngai, Janet Hsiao, Lei Chen
Besides, our solution also resolves a fundamental problem with the faithfulness indicator, a commonly used evaluation metric of XAI algorithms that appears to be sensitive to the OoD issue.
Explainable artificial intelligence Explainable Artificial Intelligence (XAI)
1 code implementation • 26 Jul 2021 • Peng Wu, Xiangteng He, Mingqian Tang, Yiliang Lv, Jing Liu
Based on these, we naturally construct hierarchical representations in the individual-local-global manner, where the individual level focuses on the alignment between frame and word, local level focuses on the alignment between video clip and textual context, and global level focuses on the alignment between the whole video and text.
1 code implementation • ACM Transactions on Graphics 2021 • Guoxian Song, Linjie Luo, Jing Liu, Wan-Chun Ma, Chun-Pong Lai, Chuanxia Zheng, Tat-Jen Cham
While substantial progress has been made in automated stylization, generating high quality stylistic portraits is still a challenge, and even the recent popular Toonify suffers from several artifacts when used on real input images.
2 code implementations • 1 Jul 2021 • Jing Liu, Xinxin Zhu, Fei Liu, Longteng Guo, Zijia Zhao, Mingzhen Sun, Weining Wang, Hanqing Lu, Shiyu Zhou, Jiajun Zhang, Jinqiao Wang
In this paper, we propose an Omni-perception Pre-Trainer (OPT) for cross-modal understanding and generation, by jointly modeling visual, text and audio resources.
Ranked #1 on Image Retrieval on Localized Narratives
1 code implementation • 24 Jun 2021 • Jing Liu, Sujie Li, Jiang Zhang, Pan Zhang
Despite the great potential, however, existing tensor network models for unsupervised machine learning only work as a proof of principle, as their performance is much worse than the standard models such as restricted Boltzmann machines and neural networks.
no code implementations • 11 Jun 2021 • Jing Liu, Rupak Vignesh Swaminathan, Sree Hari Krishnan Parthasarathi, Chunchuan Lyu, Athanasios Mouchtaris, Siegfried Kunzmann
We present results from Alexa speech teams on semi-supervised learning (SSL) of acoustic models (AM) with experiments spanning over 3000 hours of GPU time, making our study one of the largest of its kind.
1 code implementation • ACL 2021 • Dorottya Demszky, Jing Liu, Zid Mancenido, Julie Cohen, Heather Hill, Dan Jurafsky, Tatsunori Hashimoto
In conversation, uptake happens when a speaker builds on the contribution of their interlocutor by, for example, acknowledging, repeating or reformulating what they have said.
no code implementations • 31 May 2021 • Yi Xu, Minyi Zhao, Jing Liu, Xinjian Zhang, Longwen Gao, Shuigeng Zhou, Huyang Sun
Many deep learning based video compression artifact removal algorithms have been proposed to recover high-quality videos from low-quality compressed videos.
no code implementations • 31 May 2021 • Duanshun Li, Jing Liu, Jinsung Jeon, Seoyoung Hong, Thai Le, Dongwon Lee, Noseong Park
On top of the prediction models, we define a budget-constrained flight frequency optimization problem to maximize the market influence over 2, 262 routes.
2 code implementations • 29 May 2021 • Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai
Transformers have become one of the dominant architectures in deep learning, particularly as a powerful alternative to convolutional neural networks (CNNs) in computer vision.
1 code implementation • 21 Apr 2021 • Ren Yang, Radu Timofte, Jing Liu, Yi Xu, Xinjian Zhang, Minyi Zhao, Shuigeng Zhou, Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy, Xin Li, Fanglong Liu, He Zheng, Lielin Jiang, Qi Zhang, Dongliang He, Fu Li, Qingqing Dang, Yibin Huang, Matteo Maggioni, Zhongqian Fu, Shuai Xiao, Cheng Li, Thomas Tanay, Fenglong Song, Wentao Chao, Qiang Guo, Yan Liu, Jiang Li, Xiaochao Qu, Dewang Hou, Jiayu Yang, Lyn Jiang, Di You, Zhenyu Zhang, Chong Mou, Iaroslav Koshelev, Pavel Ostyakov, Andrey Somov, Jia Hao, Xueyi Zou, Shijie Zhao, Xiaopeng Sun, Yiting Liao, Yuanzhi Zhang, Qing Wang, Gen Zhan, Mengxi Guo, Junlin Li, Ming Lu, Zhan Ma, Pablo Navarrete Michelini, Hai Wang, Yiyun Chen, Jingyu Guo, Liliang Zhang, Wenming Yang, Sijung Kim, Syehoon Oh, Yucong Wang, Minjie Cai, Wei Hao, Kangdi Shi, Liangyan Li, Jun Chen, Wei Gao, Wang Liu, XiaoYu Zhang, Linjie Zhou, Sixin Lin, Ru Wang
This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results.
no code implementations • 2 Apr 2021 • Kuan Zhu, Haiyun Guo, Shiliang Zhang, YaoWei Wang, Gaopan Huang, Honglin Qiao, Jing Liu, Jinqiao Wang, Ming Tang
In this paper, we introduce an alignment scheme in Transformer architecture for the first time and propose the Auto-Aligned Transformer (AAformer) to automatically locate both the human parts and non-human ones at patch-level.
2 code implementations • ICCV 2021 • Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai
However, the routine of the current ViT model is to maintain a full-length patch sequence during inference, which is redundant and lacks hierarchical representation.
Ranked #22 on Efficient ViTs on ImageNet-1K (with DeiT-T)
1 code implementation • 17 Feb 2021 • Hao Wang, Weining Wang, Jing Liu
Video semantic segmentation requires to utilize the complex temporal relations between frames of the video sequence.
Ranked #1 on Video Semantic Segmentation on Cityscapes val
no code implementations • 26 Jan 2021 • Wei Liu, Sihan Chen, Longteng Guo, Xinxin Zhu, Jing Liu
Besides, we provide detailed visualizations of the self-attention between patches in the encoder and the "words-to-patches" attention in the decoder thanks to the full Transformer architecture.
no code implementations • 26 Jan 2021 • Sihan Chen, Xinxin Zhu, Wei Liu, Xingjian He, Jing Liu
Depth information matters in RGB-D semantic segmentation task for providing additional geometric information to color images.
no code implementations • 24 Jan 2021 • Longteng Guo, Jing Liu, Xinxin Zhu, Hanqing Lu
These models are autoregressive in that they generate each word by conditioning on previously generated words, which leads to heavy latency during inference.
no code implementations • 13 Jan 2021 • Jing Liu, Bohan Zhuang, Peng Chen, Chunhua Shen, Jianfei Cai, Mingkui Tan
By jointly training the binary gates in conjunction with network parameters, the compression configurations of each layer can be automatically determined.
no code implementations • ICCV 2021 • Fei Liu, Jing Liu, Weining Wang, Hanqing Lu
Specifically, we present a novel graph memory mechanism to perform relational reasoning, and further develop two types of graph memory: a) visual graph memory that leverages visual information of video for relational reasoning; b) semantic graph memory that is specifically designed to explicitly leverage semantic knowledge contained in the classes and attributes of video objects, and perform relational reasoning in the semantic space.
no code implementations • 31 Dec 2020 • Yuefeng Di, Jialong Wang, Ruiyu Zhou, Ligong Bian, Rong-Gen Cai, Jing Liu
We perform the three dimensional lattice simulation of the magnetic field and gravitational wave productions from bubble collisions during the first-order electroweak phase transition.
Cosmology and Nongalactic Astrophysics High Energy Physics - Lattice High Energy Physics - Phenomenology
no code implementations • 16 Dec 2020 • Xinxin Zhu, Weining Wang, Longteng Guo, Jing Liu
The whole process involves a visual understanding module and a language generation module, which brings more challenges to the design of deep neural networks than other tasks.
1 code implementation • 9 Dec 2020 • Jing Liu, Jiaxiang Wang, Weikang Wang, Yuting Su
In this paper, we investigate the complimentary roles of spatial and temporal information and propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of spatiotemporal information.
1 code implementation • NAACL 2021 • Yingqi Qu, Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Wayne Xin Zhao, daxiang dong, Hua Wu, Haifeng Wang
In open-domain question answering, dense passage retrieval has become a new paradigm to retrieve relevant passages for finding answers.
Ranked #4 on Passage Retrieval on Natural Questions
no code implementations • 21 Sep 2020 • Yixin Liu, Yong Guo, Zichang Liu, Haohua Liu, Jingjie Zhang, Zejun Chen, Jing Liu, Jian Chen
To address this issue, given a target compression rate for the whole model, one can search for the optimal compression rate for each layer.
no code implementations • 21 Aug 2020 • Jing Liu, Aditya Deshmukh, Venugopal V. Veeravalli
We study the robust mean estimation problem in high dimensions, where $\alpha <0. 5$ fraction of the data points can be arbitrarily corrupted.
1 code implementation • TNNLS 2020 • Jun Fu, Jing Liu, Jie Jiang, Yong Li, Yongjun Bao, Hanqing Lu
We conduct extensive experiments to validate the effectiveness of our network and achieve new state-of-the-art segmentation performance on four challenging scene segmentation data sets, i. e., Cityscapes, ADE20K, PASCAL Context, and COCO Stuff data sets.
Ranked #8 on Semantic Segmentation on COCO-Stuff test
1 code implementation • CVPR 2021 • Peng Chen, Jing Liu, Bohan Zhuang, Mingkui Tan, Chunhua Shen
Network quantization allows inference to be conducted using low-precision arithmetic for improved inference efficiency of deep neural networks on edge devices.
1 code implementation • ECCV 2020 • Peng Wu, Jing Liu, Yujia Shi, Yujia Sun, Fangtao Shao, Zhaoyang Wu, Zhiwei Yang
Violence detection has been studied in computer vision for years.
no code implementations • 19 Jun 2020 • Shuai Han, Wenbo Zhou, Jing Liu, Shuai Lü
Effective exploration for noisy networks is one of the most important issues in deep reinforcement learning.
no code implementations • 10 May 2020 • Longteng Guo, Jing Liu, Xinxin Zhu, Xingjian He, Jie Jiang, Hanqing Lu
In this paper, we propose a Non-Autoregressive Image Captioning (NAIC) model with a novel training paradigm: Counterfactuals-critical Multi-Agent Learning (CMAL).
no code implementations • 7 May 2020 • Codruta O. Ancuti, Cosmin Ancuti, Florin-Alexandru Vasluianu, Radu Timofte, Jing Liu, Haiyan Wu, Yuan Xie, Yanyun Qu, Lizhuang Ma, Ziling Huang, Qili Deng, Ju-Chin Chao, Tsung-Shan Yang, Peng-Wen Chen, Po-Min Hsu, Tzu-Yi Liao, Chung-En Sun, Pei-Yuan Wu, Jeonghyeok Do, Jongmin Park, Munchurl Kim, Kareem Metwaly, Xuelu Li, Tiantong Guo, Vishal Monga, Mingzhao Yu, Venkateswararao Cherukuri, Shiue-Yuan Chuang, Tsung-Nan Lin, David Lee, Jerome Chang, Zhan-Han Wang, Yu-Bang Chang, Chang-Hong Lin, Yu Dong, Hong-Yu Zhou, Xiangzhen Kong, Sourya Dipta Das, Saikat Dutta, Xuan Zhao, Bing Ouyang, Dennis Estrada, Meiqi Wang, Tianqi Su, Siyi Chen, Bangyong Sun, Vincent Whannou de Dravo, Zhe Yu, Pratik Narang, Aryan Mehra, Navaneeth Raghunath, Murari Mandal
We focus on the proposed solutions and their results evaluated on NH-Haze, a novel dataset consisting of 55 pairs of real haze free and nonhomogeneous hazy images recorded outdoor.
no code implementations • 5 May 2020 • Dario Fuoli, Zhiwu Huang, Martin Danelljan, Radu Timofte, Hua Wang, Longcun Jin, Dewei Su, Jing Liu, Jaehoon Lee, Michal Kudelski, Lukasz Bala, Dmitry Hrybov, Marcin Mozejko, Muchen Li, Si-Yao Li, Bo Pang, Cewu Lu, Chao Li, Dongliang He, Fu Li, Shilei Wen
For track 2, some existing methods are evaluated, showing promising solutions to the weakly-supervised video quality mapping problem.
no code implementations • 3 May 2020 • Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu, Dongliang He, Wenhao Wu, Yukang Ding, Chao Li, Fu Li, Shilei Wen, Jianwei Li, Fuzhi Yang, Huan Yang, Jianlong Fu, Byung-Hoon Kim, JaeHyun Baek, Jong Chul Ye, Yuchen Fan, Thomas S. Huang, Junyeop Lee, Bokyeung Lee, Jungki Min, Gwantae Kim, Kanghyu Lee, Jaihyun Park, Mykola Mykhailych, Haoyu Zhong, Yukai Shi, Xiaojun Yang, Zhijing Yang, Liang Lin, Tongtong Zhao, Jinjia Peng, Huibing Wang, Zhi Jin, Jiahao Wu, Yifu Chen, Chenming Shang, Huanrong Zhang, Jeongki Min, Hrishikesh P. S, Densen Puthussery, Jiji C. V
This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with focus on proposed solutions and results.
3 code implementations • 23 Apr 2020 • Hongxuan Tang, Hongyu Li, Jing Liu, Yu Hong, Hua Wu, Haifeng Wang
Machine reading comprehension (MRC) is a crucial task in natural language processing and has achieved remarkable advancements.
no code implementations • 14 Apr 2020 • Yaoxin Li, Jing Liu, Guozheng Lin, Yueyuan Hou, Muyun Mou, Jiang Zhang
In computer science, there exist a large number of optimization problems defined on graphs, that is to find a best node state configuration or a network structure such that the designed objective function is optimized under some constraints.
no code implementations • CVPR 2020 • Longteng Guo, Jing Liu, Xinxin Zhu, Peng Yao, Shichen Lu, Hanqing Lu
First, we propose Normalized Self-Attention (NSA), a reparameterization of SA that brings the benefits of normalization inside SA.
no code implementations • 13 Mar 2020 • Jielei Chu, Jing Liu, Hongjun Wang, Meng Hua, Zhiguo Gong, Tianrui Li
To explore the representation learning capability under the continuous stimulation of the SPI, we present a deep Micro-supervised Disturbance Learning (Micro-DL) framework based on the Micro-DGRBM and Micro-DRBM models and compare it with a similar deep structure which has not any external stimulation.
3 code implementations • ECCV 2020 • Shoukai Xu, Haokun Li, Bohan Zhuang, Jing Liu, JieZhang Cao, Chuangrun Liang, Mingkui Tan
More critically, our method achieves much higher accuracy on 4-bit quantization than the existing data free quantization method.
Ranked #2 on Data Free Quantization on CIFAR-100
no code implementations • 3 Mar 2020 • Shanchao Yang, Jing Liu, Kai Wu, Mingming Li
Differently, in this paper, we are interested in a novel problem named Time Series Conditioned Graph Generation: given an input multivariate time series, we aim to infer a target relation graph modeling the underlying interrelationships between time series with each node corresponding to each time series.
1 code implementation • 4 Jan 2020 • Jing Liu, Bohan Zhuang, Zhuangwei Zhuang, Yong Guo, Junzhou Huang, Jinhui Zhu, Mingkui Tan
In this paper, we propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power.
no code implementations • 15 Dec 2019 • Jianqing Jia, Semir Elezovikj, Heng Fan, Shuojin Yang, Jing Liu, Wei Guo, Chiu C. Tan, Haibin Ling
Our solution encodes the constraints for placing labels in an optimization problem to obtain the final label layout, and the labels will be placed in appropriate positions to reduce the chances of overlaying important real-world objects in street view AR scenarios.
3 code implementations • 6 Nov 2019 • Quan Wang, Pingping Huang, Haifeng Wang, Songtai Dai, Wenbin Jiang, Jing Liu, Yajuan Lyu, Yong Zhu, Hua Wu
This work presents Contextualized Knowledge Graph Embedding (CoKE), a novel paradigm that takes into account such contextual nature, and learns dynamic, flexible, and fully contextualized entity and relation embeddings.
no code implementations • ICCV 2019 • Jun Fu, Jing Liu, Yuhang Wang, Yong Li, Yongjun Bao, Jinhui Tang, Hanqing Lu
Recent works attempt to improve scene parsing performance by exploring different levels of contexts, and typically train a well-designed convolutional network to exploit useful contexts across all pixels equally.
Ranked #72 on Semantic Segmentation on ADE20K val
1 code implementation • WS 2019 • Hongyu Li, Xiyuan Zhang, Yibing Liu, Yiming Zhang, Quan Wang, Xiangyang Zhou, Jing Liu, Hua Wu, Haifeng Wang
In this paper, we introduce a simple system Baidu submitted for MRQA (Machine Reading for Question Answering) 2019 Shared Task that focused on generalization of machine reading comprehension (MRC) models.
no code implementations • 30 Oct 2019 • Jing Liu, Stefan Engblom, Carl Nettelblad
Modern Flash X-ray diffraction Imaging (FXI) acquires diffraction signals from single biomolecules at a high repetition rate from X-ray Free Electron Lasers (XFELs), easily obtaining millions of 2D diffraction patterns from a single experiment.
no code implementations • 17 Oct 2019 • Xinxin Zhu, Longteng Guo, Peng Yao, Shichen Lu, Wei Liu, Jing Liu
This report describes our solution for the VATEX Captioning Challenge 2020, which requires generating descriptions for the videos in both English and Chinese languages.
no code implementations • 16 Sep 2019 • Jing Liu, Fei Gao, Jiang Zhang
Many problems in real life can be converted to combinatorial optimization problems (COPs) on graphs, that is to find a best node state configuration or a network structure such that the designed objective function is optimized under some constraints.
no code implementations • 10 Aug 2019 • Bohan Zhuang, Jing Liu, Mingkui Tan, Lingqiao Liu, Ian Reid, Chunhua Shen
Furthermore, we propose a second progressive quantization scheme which gradually decreases the bit-width from high-precision to low-precision during training.
1 code implementation • 6 Aug 2019 • Longteng Guo, Jing Liu, Jinhui Tang, Jiangwei Li, Wei Luo, Hanqing Lu
Image captioning attempts to generate a sentence composed of several linguistic words, which are used to describe objects, attributes, and interactions in an image, denoted as visual semantic units in this paper.
no code implementations • 18 Jul 2019 • Jing Liu, Haidong Yuan, Xiao-Ming Lu, Xiaoguang Wang
Quantum Fisher information matrix (QFIM) is a core concept in theoretical quantum metrology due to the significant importance of quantum Cram\'{e}r-Rao bound in quantum parameter estimation.
Quantum Physics
1 code implementation • ACL 2019 • An Yang, Quan Wang, Jing Liu, Kai Liu, Yajuan Lyu, Hua Wu, Qiaoqiao She, Sujian Li
In this work, we investigate the potential of leveraging external knowledge bases (KBs) to further improve BERT for MRC.
no code implementations • 12 Jun 2019 • Jielei Chu, Hongjun Wang, Jing Liu, Zhiguo Gong, Tianrui Li
In mcrRBM and mcrGRBM models, the structure and multi-local collaborative relationships of unlabeled data are integrated into their encoding procedure.
no code implementations • 11 Jun 2019 • Duanshun Li, Jing Liu, Noseong Park, Dongeun Lee, Giridhar Ramachandran, Ali Seyedmazloom, Kookjin Lee, Chen Feng, Vadim Sokolov, Rajesh Ganesan
0-1 knapsack is of fundamental importance in computer science, business, operations research, etc.
no code implementations • 8 Mar 2019 • Tianwen Jiang, Sendong Zhao, Jing Liu, Jin-Ge Yao, Ming Liu, Bing Qin, Ting Liu, Chin-Yew Lin
Time-DS is composed of a time series instance-popularity and two strategies.
1 code implementation • 30 Dec 2018 • Zhang Zhang, Yi Zhao, Jing Liu, Shuo Wang, Ruyi Tao, Ruyue Xin, Jiang Zhang
We exhibit the universality of our framework on different kinds of time-series data: with the same structure, our model can be trained to accurately recover the network structure and predict future states on continuous, discrete, and binary dynamics, and outperforms competing network reconstruction methods.
no code implementations • 5 Dec 2018 • Jielei Chu, Hongjun Wang, Jing Liu, Zhiguo Gong, Tianrui Li
In this paper, we present a novel unsupervised feature learning architecture, which consists of a multi-clustering integration module and a variant of RBM termed multi-clustering integration RBM (MIRBM).
1 code implementation • NeurIPS 2018 • Zhuangwei Zhuang, Mingkui Tan, Bohan Zhuang, Jing Liu, Yong Guo, Qingyao Wu, Junzhou Huang, Jinhui Zhu
Channel pruning is one of the predominant approaches for deep model compression.
no code implementations • 25 Oct 2018 • Jing Liu, Gijs van der Schot, Stefan Engblom
It is also straightforward to parallelize them so as to fully match the XFEL repetition rate, thereby enabling processing at site.
no code implementations • CONLL 2018 • Feng Nie, Shuyan Zhou, Jing Liu, Jinpeng Wang, Chin-Yew Lin, Rong pan
The task of entity linking aims to identify concepts mentioned in a text fragments and link them to a reference knowledge base.
no code implementations • EMNLP 2018 • Xingwu Sun, Jing Liu, Yajuan Lyu, wei he, Yanjun Ma, Shi Wang
(2) The model copies the context words that are far from and irrelevant to the answer, instead of the words that are close and relevant to the answer.
12 code implementations • CVPR 2019 • Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang, Hanqing Lu
Specifically, we append two types of attention modules on top of traditional dilated FCN, which model the semantic interdependencies in spatial and channel dimensions respectively.
Ranked #6 on Semantic Segmentation on Trans10K
no code implementations • COLING 2018 • Danqing Huang, Jing Liu, Chin-Yew Lin, Jian Yin
Experimental results show that (1) The copy and alignment mechanism is effective to address the two issues; (2) Reinforcement learning leads to better performance than maximum likelihood on this task; (3) Our neural model is complementary to the feature-based model and their combination significantly outperforms the state-of-the-art results.
no code implementations • WS 2018 • An Yang, Kai Liu, Jing Liu, Yajuan Lyu, Sujian Li
Current evaluation metrics to question answering based machine reading comprehension (MRC) systems generally focus on the lexical overlap between the candidate and reference answers, such as ROUGE and BLEU.