Search Results for author: Liqiang Nie

Found 120 papers, 68 papers with code

MMCoQA: Conversational Question Answering over Text, Tables, and Images

1 code implementation • ACL 2022 • Yongqi Li, Wenjie Li, Liqiang Nie

In this paper, we hence define a novel research task, i.e., multimodal conversational question answering (MMCoQA), aiming to answer users’ questions with multimodal knowledge sources via multi-turn conversations.

Benchmarking Conversational Question Answering +1

MMGRec: Multimodal Generative Recommendation with Transformer Model

no code implementations • 25 Apr 2024 • Han Liu, Yinwei Wei, Xuemeng Song, Weili Guan, Yuan-Fang Li, Liqiang Nie

Multimodal recommendation aims to recommend user-preferred candidates based on her/his historically interacted items and associated multimodal information.

Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation

1 code implementation • 21 Apr 2024 • Gensheng Pei, Yazhou Yao, Jianbo Jiao, Wenguan Wang, Liqiang Nie, Jinhui Tang

To achieve this objective, we present a unified self-supervised approach to learn visual representations of static-dynamic feature similarity.

Semantic Segmentation Video Object Segmentation +1

FecTek: Enhancing Term Weight in Lexicon-Based Retrieval with Feature Context and Term-level Knowledge

no code implementations • 18 Apr 2024 • Zunran Wang, Zhonghua Li, Wei Shen, Qi Ye, Liqiang Nie

To effectively enrich the feature context representations of term weight, the Feature Context Module (FCM) is introduced, which leverages the power of BERT's representation to determine dynamic weights for each element in the embedding.

Contrastive Learning Retrieval +1

Cluster-based Graph Collaborative Filtering

1 code implementation • 16 Apr 2024 • Fan Liu, Shuai Zhao, Zhiyong Cheng, Liqiang Nie, Mohan Kankanhalli

This model performs high-order graph convolution on cluster-specific graphs, which are constructed by capturing the multiple interests of users and identifying the common interests among them.

Clustering Collaborative Filtering +3

LLMvsSmall Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model

no code implementations • 12 Mar 2024 • Linmei Hu, Hongyu He, Duokang Wang, Ziwang Zhao, Yingxia Shao, Liqiang Nie

Furthermore, we utilize the LLM to enrich the information of personality labels for enhancing the detection performance.

Contrastive Learning Language Modelling +2

Discriminative Probing and Tuning for Text-to-Image Generation

no code implementations • 7 Mar 2024 • Leigang Qu, Wenjie Wang, Yongqi Li, Hanwang Zhang, Liqiang Nie, Tat-Seng Chua

We present a discriminative adapter built on T2I models to probe their discriminative abilities on two representative tasks and leverage discriminative fine-tuning to improve their text-image alignment.

Text-to-Image Generation

WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More

no code implementations • 19 Feb 2024 • Yuxuan Yue, Zhihang Yuan, Haojie Duanmu, Sifan Zhou, Jianlong Wu, Liqiang Nie

Large Language Models (LLMs) face significant deployment challenges due to their substantial memory requirements and the computational demands of the auto-regressive text generation process.

Quantization Text Generation

Interactive Garment Recommendation with User in the Loop

no code implementations • 18 Feb 2024 • Federico Becattini, Xiaolin Chen, Andrea Puccia, Haokun Wen, Xuemeng Song, Liqiang Nie, Alberto del Bimbo

Recommending fashion items often leverages rich user profiles and makes targeted suggestions based on past history and previous purchases.

reinforcement-learning

Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond

no code implementations • 16 Feb 2024 • Yongqi Li, Wenjie Wang, Leigang Qu, Liqiang Nie, Wenjie Li, Tat-Seng Chua

Building upon this capability, we propose to enable multimodal large language models (MLLMs) to memorize and recall images within their parameters.

Cross-Modal Retrieval Retrieval

Distillation Enhanced Generative Retrieval

no code implementations • 16 Feb 2024 • Yongqi Li, Zhen Zhang, Wenjie Wang, Liqiang Nie, Wenjie Li, Tat-Seng Chua

Generative retrieval is a promising new paradigm in text retrieval that generates identifier strings of relevant passages as the retrieval target.

Retrieval Text Retrieval

Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue

no code implementations • 6 Feb 2024 • Kun Ouyang, Liqiang Jing, Xuemeng Song, Meng Liu, Yupeng Hu, Liqiang Nie

Although existing studies have achieved great success based on the generative pretrained language model BART, they overlook exploiting the sentiments residing in the utterance, video and audio, which are vital clues for sarcasm explanation.

Explanation Generation Language Modelling +1

GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding

no code implementations • 3 Feb 2024 • Cunxiao Du, Jing Jiang, Xu Yuanchen, Jiawei Wu, Sicheng Yu, Yongqi Li, Shenggui Li, Kai Xu, Liqiang Nie, Zhaopeng Tu, Yang You

Speculative decoding is a relatively new decoding framework that leverages small and efficient draft models to reduce the latency of LLMs.

Diffusion Facial Forgery Detection

1 code implementation • 29 Jan 2024 • Harry Cheng, Yangyang Guo, Tianyi Wang, Liqiang Nie, Mohan Kankanhalli

In particular, this dataset leverages 30,000 carefully collected textual and visual prompts, ensuring the synthesis of images with both high fidelity and semantic consistency.

Image Generation

Spatial Structure Constraints for Weakly Supervised Semantic Segmentation

1 code implementation • 20 Jan 2024 • Tao Chen, Yazhou Yao, Xingguo Huang, Zechao Li, Liqiang Nie, Jinhui Tang

In this paper, we propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion.

Object Object Localization +2

Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought

no code implementations • 12 Jan 2024 • Zaijing Li, Gongwei Chen, Rui Shao, Dongmei Jiang, Liqiang Nie

In this paper, we propose the Emotional Chain-of-Thought (ECoT), a plug-and-play prompting method that enhances the performance of LLMs on various emotional generation tasks by aligning with human emotional intelligence guidelines.

Emotional Intelligence Emotion Recognition +1

Understanding Before Recommendation: Semantic Aspect-Aware Review Exploitation via Large Language Models

no code implementations • 26 Dec 2023 • Fan Liu, Yaqi Liu, Zhiyong Cheng, Liqiang Nie, Mohan Kankanhalli

Recommendation systems harness user-item interactions like clicks and reviews to learn their representations.

Recommendation Systems

Attribute-driven Disentangled Representation Learning for Multimodal Recommendation

no code implementations • 22 Dec 2023 • Zhenyang Li, Fan Liu, Yinwei Wei, Zhiyong Cheng, Liqiang Nie, Mohan Kankanhalli

To obtain robust and independent representations for each factor associated with a specific attribute, we first disentangle the representations of features both within and across different modalities.

Attribute Multimodal Recommendation +1

VK-G2T: Vision and Context Knowledge enhanced Gloss2Text

no code implementations • 15 Dec 2023 • Liqiang Jing, Xuemeng Song, Xinxing Zu, Na Zheng, Zhongzhou Zhao, Liqiang Nie

Existing sign language translation methods follow a two-stage pipeline: first converting the sign language video to a gloss sequence (i.e., Sign2Gloss) and then translating the generated gloss sequence into a spoken language sentence (i.e., Gloss2Text).

Sentence Sign Language Translation +1

Unsupervised Temporal Action Localization via Self-paced Incremental Learning

1 code implementation • 12 Dec 2023 • Haoyu Tang, Han Jiang, Mingzhu Xu, Yupeng Hu, Jihua Zhu, Liqiang Nie

Thereafter, we design two (constant- and variable- speed) incremental instance learning strategies for easy-to-hard model training, thus ensuring the reliability of these video pseudolabels and further improving overall localization performance.

Clustering Incremental Learning +3

GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians

1 code implementation • 4 Dec 2023 • Liangxiao Hu, Hongwen Zhang, Yuxiang Zhang, Boyao Zhou, Boning Liu, Shengping Zhang, Liqiang Nie

We present GaussianAvatar, an efficient approach to creating realistic human avatars with dynamic 3D appearances from a single video.

Motion Estimation

GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis

1 code implementation • 4 Dec 2023 • Shunyuan Zheng, Boyao Zhou, Ruizhi Shao, Boning Liu, Shengping Zhang, Liqiang Nie, Yebin Liu

We present a new approach, termed GPS-Gaussian, for synthesizing novel views of a character in a real-time manner.

2k Depth Estimation +1

RTQ: Rethinking Video-language Understanding Based on Image-text Model

2 code implementations • 1 Dec 2023 • Xiao Wang, Yaoyu Li, Tian Gan, Zheng Zhang, Jingjing Lv, Liqiang Nie

Recent advancements in video-language understanding have been established on the foundation of image-text models, resulting in promising outcomes due to the shared knowledge between images and videos.

Ranked #9 on Video Retrieval on MSR-VTT-1kA

Video Captioning Video Question Answering +1

Generating Human-Centric Visual Cues for Human-Object Interaction Detection via Large Vision-Language Models

no code implementations • 26 Nov 2023 • Yu-Wei Zhan, Fan Liu, Xin Luo, Liqiang Nie, Xin-Shun Xu, Mohan Kankanhalli

To capitalize on these rich Human-Centric Visual Cues, we propose a novel approach named HCVC for HOI detection.

Human-Object Interaction Detection

LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge

1 code implementation • 20 Nov 2023 • Gongwei Chen, Leyang Shen, Rui Shao, Xiang Deng, Liqiang Nie

1) Progressive incorporation of fine-grained spatial-aware visual knowledge.

Language Modelling Large Language Model

An Empirical Study of Frame Selection for Text-to-Video Retrieval

no code implementations • 1 Nov 2023 • Mengxia Wu, Min Cao, Yang Bai, Ziyin Zeng, Chen Chen, Liqiang Nie, Min Zhang

In this paper, we make the first empirical study of frame selection for TVR.

Retrieval Text to Video Retrieval +1

UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models

1 code implementation • 17 Oct 2023 • Yangyang Guo, Fangkai Jiao, Zhiqi Shen, Liqiang Nie, Mohan Kankanhalli

Teaching Visual Question Answering (VQA) models to refrain from answering unanswerable questions is necessary for building a trustworthy AI system.

Attribute Question Answering +1

Uncovering Hidden Connections: Iterative Tracking and Reasoning for Video-grounded Dialog

no code implementations • 11 Oct 2023 • Haoyu Zhang, Meng Liu, YaoWei Wang, Da Cao, Weili Guan, Liqiang Nie

In response to this gap, we present an iterative tracking and reasoning strategy that amalgamates a textual encoder, a visual encoder, and a generator.

Question Answering Response Generation +1

ELIP: Efficient Language-Image Pre-training with Fewer Vision Tokens

1 code implementation • 28 Sep 2023 • Yangyang Guo, Haoyu Zhang, Yongkang Wong, Liqiang Nie, Mohan Kankanhalli

Learning a versatile language-image model is computationally prohibitive under a limited computing budget.

Cross-Modal Retrieval Image Captioning +1

Detecting and Grounding Multi-Modal Media Manipulation and Beyond

1 code implementation • 25 Sep 2023 • Rui Shao, Tianxing Wu, Jianlong Wu, Liqiang Nie, Ziwei Liu

HAMMER performs 1) manipulation-aware contrastive learning between two uni-modal encoders as shallow manipulation reasoning, and 2) modality-aware cross-attention by multi-modal aggregator as deep manipulation reasoning.

Binary Classification Contrastive Learning +4

Building Emotional Support Chatbots in the Era of LLMs

no code implementations • 17 Aug 2023 • Zhonghua Zheng, Lizi Liao, Yang Deng, Liqiang Nie

The integration of emotional support into various conversational scenarios presents profound societal benefits, such as social interactions, mental health counseling, and customer service.

In-Context Learning Navigate

Temporal Sentence Grounding in Streaming Videos

1 code implementation • 14 Aug 2023 • Tian Gan, Xiao Wang, Yan Sun, Jianlong Wu, Qingpei Guo, Liqiang Nie

The goal of TSGSV is to evaluate the relevance between a video stream and a given sentence query.

Sentence Temporal Sentence Grounding

LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation

no code implementations • 9 Aug 2023 • Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-Seng Chua

Afterward, we propose a fine-grained object-interaction diffusion method to synthesize high-faithfulness images conditioned on the prompt and the automatically generated layout.

In-Context Learning Text-to-Image Generation

Semantic-Guided Feature Distillation for Multimodal Recommendation

1 code implementation • 6 Aug 2023 • Fan Liu, Huilin Chen, Zhiyong Cheng, Liqiang Nie, Mohan Kankanhalli

The teacher model first extracts rich modality features from the generic modality feature by considering both the semantic information of items and the complementary information of multiple modalities.

Multimodal Recommendation Representation Learning

StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning

1 code implementation • 6 Aug 2023 • Peiguang Jing, Xianyi Liu, Ji Wang, Yinwei Wei, Liqiang Nie, Yuting Su

Emotion distribution learning has gained increasing attention with the tendency to express emotions through images.

Sample Less, Learn More: Efficient Action Recognition via Frame Feature Restoration

1 code implementation • 27 Jul 2023 • Harry Cheng, Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Mohan Kankanhalli

Training an effective video action recognition model poses significant computational challenges, particularly under limited resource budgets.

Action Recognition Temporal Action Localization

Towards Generalizable Deepfake Detection by Primary Region Regularization

no code implementations • 24 Jul 2023 • Harry Cheng, Yangyang Guo, Tianyi Wang, Liqiang Nie, Mohan Kankanhalli

The existing deepfake detection methods have reached a bottleneck in generalizing to unseen forgeries and manipulation approaches.

DeepFake Detection Face Swapping

General Debiasing for Multimodal Sentiment Analysis

1 code implementation • 20 Jul 2023 • Teng Sun, Juntong Ni, Wenjie Wang, Liqiang Jing, Yinwei Wei, Liqiang Nie

To this end, we propose a general debiasing framework based on Inverse Probability Weighting (IPW), which adaptively assigns small weights to the samples with larger bias (i.e., the more severe spurious correlations).

Multimodal Sentiment Analysis

LightGT: A Light Graph Transformer for Multimedia Recommendation

1 code implementation • SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023 • Yinwei Wei, Wenqi Liu, Fan Liu, Xiang Wang, Liqiang Nie, Tat-Seng Chua

Considering its challenges in effectiveness and efficiency, we propose a novel Transformer-based recommendation model, termed as Light Graph Transformer model (LightGT).

Ranked #1 on Multi-Media Recommendation on Kwai (Recall@10 metric)

Collaborative Filtering Microvideo Recommendation +4

Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation

1 code implementation • 29 Jun 2023 • Liqiang Jing, Xuemeng Song, Kun Ouyang, Mengzhao Jia, Liqiang Nie

Multimodal Sarcasm Explanation (MuSE) is a new yet challenging task, which aims to generate a natural language sentence for a multimodal social post (an image as well as its caption) to explain why it contains sarcasm.

Explanation Generation Object +1

A Survey on Video Moment Localization

no code implementations • 13 Jun 2023 • Meng Liu, Liqiang Nie, Yunxiao Wang, Meng Wang, Yong Rui

Video moment localization, also known as video moment retrieval, aims to search for a target segment within a video described by a given natural language query.

Moment Retrieval Retrieval +1

Learning Geometric Transformation for Point Cloud Completion

2 code implementations • International Journal of Computer Vision 2023 • Shengping Zhang, Xianzhu Liu, Haozhe Xie, Liqiang Nie, Huiyu Zhou, DaCheng Tao, Xuelong Li

It exploits the repetitive geometric structures in common 3D objects to recover the complete shapes, which contains three sub-networks: geometric patch network, structure transformation network, and detail refinement network.

Ranked #4 on Point Cloud Completion on ShapeNet

Point Cloud Completion

DeepFake-Adapter: Dual-Level Adapter for DeepFake Detection

1 code implementation • 1 Jun 2023 • Rui Shao, Tianxing Wu, Liqiang Nie, Ziwei Liu

Unlike existing deepfake detection methods merely focusing on low-level forgery patterns, the forgery detection process of our model can be regularized by generalizable high-level semantics from a pre-trained ViT and adapted by global and local low-level forgeries of deepfake data.

DeepFake Detection Face Swapping

RaSa: Relation and Sensitivity Aware Representation Learning for Text-based Person Search

1 code implementation • 23 May 2023 • Yang Bai, Min Cao, Daming Gao, Ziqiang Cao, Chen Chen, Zhenfeng Fan, Liqiang Nie, Min Zhang

RA offsets the overfitting risk by introducing a novel positive relation detection task (i.e., learning to distinguish strong and weak positive pairs).

Ranked #2 on Text based Person Retrieval on RSTPReid

Person Search Relation +2

Text-based Person Search without Parallel Image-Text Data

no code implementations • 22 May 2023 • Yang Bai, Jingyao Wang, Min Cao, Chen Chen, Ziqiang Cao, Liqiang Nie, Min Zhang

Text-based person search (TBPS) aims to retrieve the images of the target person from a large image gallery based on a given natural language description.

Image Captioning Language Modelling +4

Dual Semantic Knowledge Composed Multimodal Dialog Systems

no code implementations • 17 May 2023 • Xiaolin Chen, Xuemeng Song, Yinwei Wei, Liqiang Nie, Tat-Seng Chua

Thereafter, considering that the attribute knowledge and relation knowledge can help in responding to different levels of questions, we design a multi-level knowledge composition module in MDS-S2 to obtain the latent composed response representation.

Attribute Relation +1

Stylized Data-to-Text Generation: A Case Study in the E-Commerce Domain

no code implementations • 5 May 2023 • Liqiang Jing, Xuemeng Song, Xuming Lin, Zhongzhou Zhao, Wei Zhou, Liqiang Nie

This task is non-trivial, due to three challenges: the logic of the generated text, unstructured style reference, and biased training samples.

Attribute Data-to-Text Generation

Learnable Pillar-based Re-ranking for Image-Text Retrieval

1 code implementation • 25 Apr 2023 • Leigang Qu, Meng Liu, Wenjie Wang, Zhedong Zheng, Liqiang Nie, Tat-Seng Chua

Image-text retrieval aims to bridge the modality gap and retrieve cross-modal content based on semantic similarities.

Re-Ranking Retrieval +1

ChatLLM Network: More brains, More intelligence

no code implementations • 24 Apr 2023 • Rui Hao, Linmei Hu, Weijian Qi, Qingliu Wu, Yirui Zhang, Liqiang Nie

Dialogue-based language models mark a huge milestone in the field of artificial intelligence, by their impressive ability to interact with users, as well as a series of challenging tasks prompted by customized instructions.

Decision Making

Rethinking Context Aggregation in Natural Image Matting

1 code implementation • 3 Apr 2023 • Qinglin Liu, Shengping Zhang, Quanling Meng, Ru Li, Bineng Zhong, Liqiang Nie

For natural image matting, context information plays a crucial role in estimating alpha mattes especially when it is challenging to distinguish foreground from its background.

Ranked #1 on Image Matting on Composition-1K

Image Matting

Learning Reliable Representations for Incomplete Multi-View Partial Multi-Label Classification

no code implementations • 30 Mar 2023 • Chengliang Liu, Jie Wen, Yong Xu, Liqiang Nie, Min Zhang

The application of multi-view contrastive learning has further facilitated this process; however, the existing multi-view contrastive learning methods crudely separate the so-called negative pairs, which largely results in the separation of samples belonging to the same or similar categories.

Classification Contrastive Learning +3

Micro-video Tagging via Jointly Modeling Social Influence and Tag Relation

1 code implementation • 15 Mar 2023 • Xiao Wang, Tian Gan, Yinwei Wei, Jianlong Wu, Dai Meng, Liqiang Nie

Existing methods mostly focus on analyzing video content, neglecting users' social influence and tag relation.

Link Prediction Relation +3

Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening

no code implementations • 14 Mar 2023 • Min Cao, Yang Bai, Jingyao Wang, Ziqiang Cao, Liqiang Nie, Min Zhang

The proposed framework equipped with only two embedding layers achieves $O(1)$ querying time complexity, while improving the retrieval efficiency and keeping its performance, when applied prior to the common image-text retrieval methods.

Multi-Label Classification Multi-Task Learning +2

Deep Learning and Medical Imaging for COVID-19 Diagnosis: A Comprehensive Survey

no code implementations • 13 Feb 2023 • Song Wu, Yazhou Ren, Aodi Yang, Xinyue Chen, Xiaorong Pu, Jing He, Liqiang Nie, Philip S. Yu

In this survey, we investigate the main contributions of deep learning applications using medical images in fighting against COVID-19 from the aspects of image classification, lesion localization, and severity quantification, and review different deep learning architectures and some image preprocessing techniques for achieving a more precise diagnosis.

COVID-19 Diagnosis Image Classification

Learning to Agree on Vision Attention for Visual Commonsense Reasoning

no code implementations • 4 Feb 2023 • Zhenyang Li, Yangyang Guo, Kejie Wang, Fan Liu, Liqiang Nie, Mohan Kankanhalli

Visual Commonsense Reasoning (VCR) remains a significant yet challenging research problem in the realm of visual reasoning.

Visual Commonsense Reasoning

HS-GCN: Hamming Spatial Graph Convolutional Networks for Recommendation

1 code implementation • 13 Jan 2023 • Han Liu, Yinwei Wei, Jianhua Yin, Liqiang Nie

Towards this end, existing methods tend to code users by modeling their Hamming similarities with the items they historically interact with, which are termed as the first-order similarities in this work.

Recommendation Systems

CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video-Text Dataset

1 code implementation • CVPR 2023 • Tian Gan, Qing Wang, Xingning Dong, Xiangyuan Ren, Liqiang Nie, Qingpei Guo

Though there are certain methods studying the Chinese video-text pre-training, they pre-train their models on private datasets whose videos and text are unavailable.

CHMATCH: Contrastive Hierarchical Matching and Robust Adaptive Threshold Boosted Semi-Supervised Learning

1 code implementation • CVPR 2023 • Jianlong Wu, Haozhe Yang, Tian Gan, Ning Ding, Feijun Jiang, Liqiang Nie

In the meantime, we make full use of the structured information in the hierarchical labels to learn an accurate affinity graph for contrastive learning.

Contrastive Learning

Multi-queue Momentum Contrast for Microvideo-Product Retrieval

1 code implementation • 22 Dec 2022 • Yali Du, Yinwei Wei, Wei Ji, Fan Liu, Xin Luo, Liqiang Nie

The booming development and huge market of micro-videos bring new e-commerce channels for merchants.

Representation Learning Retrieval

Causal Inference for Knowledge Graph based Recommendation

1 code implementation • 20 Dec 2022 • Yinwei Wei, Xiang Wang, Liqiang Nie, Shaoyu Li, Dingxian Wang, Tat-Seng Chua

Knowledge Graph (KG), as a side-information, tends to be utilized to supplement the collaborative filtering (CF) based recommendation model.

Collaborative Filtering counterfactual +1

Multimodal Matching-aware Co-attention Networks with Mutual Knowledge Distillation for Fake News Detection

no code implementations • 12 Dec 2022 • Linmei Hu, Ziwang Zhao, Weijian Qi, Xuemeng Song, Liqiang Nie

Additionally, based on the designed image-text matching-aware co-attention mechanism, we propose to build two co-attention networks respectively centered on text and image for mutual knowledge distillation to improve fake news detection.

Fake News Detection Image-text matching +2

A Survey of Knowledge Enhanced Pre-trained Language Models

no code implementations • 11 Nov 2022 • Linmei Hu, Zeyi Liu, Ziwang Zhao, Lei Hou, Liqiang Nie, Juanzi Li

We introduce appropriate taxonomies respectively for Natural Language Understanding (NLU) and Natural Language Generation (NLG) to highlight these two main tasks of NLP.

Natural Language Understanding Retrieval +2

Privacy-Preserving Synthetic Data Generation for Recommendation Systems

1 code implementation • 27 Sep 2022 • Fan Liu, Zhiyong Cheng, Huilin Chen, Yinwei Wei, Liqiang Nie, Mohan Kankanhalli

At the item level, a synthetic data generation module is proposed to generate a synthetic item corresponding to the selected item based on the user's preferences.

Privacy Preserving Recommendation Systems +1

Deep Convolutional Pooling Transformer for Deepfake Detection

no code implementations • 12 Sep 2022 • Tianyi Wang, Harry Cheng, Kam Pui Chow, Liqiang Nie

Most existing deep learning methods mainly focus on local features and relations within the face image using convolutional neural networks as a backbone.

DeepFake Detection Face Swapping +1

Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem

no code implementations • 24 Jul 2022 • Yudong Han, Liqiang Nie, Jianhua Yin, Jianlong Wu, Yan Yan

Several studies have recently pointed out that existing Visual Question Answering (VQA) models suffer heavily from the language prior problem, which refers to capturing superficial statistical correlations between the question type and the answer while ignoring the image contents.

Question Answering Visual Question Answering

Counterfactual Reasoning for Out-of-distribution Multimodal Sentiment Analysis

1 code implementation • 24 Jul 2022 • Teng Sun, Wenjie Wang, Liqiang Jing, Yiran Cui, Xuemeng Song, Liqiang Nie

Inspired by this, we devise a model-agnostic counterfactual framework for multimodal sentiment analysis, which captures the direct effect of textual modality via an extra text model and estimates the indirect one by a multimodal model.

counterfactual Counterfactual Inference +2

Semantic-aware Modular Capsule Routing for Visual Question Answering

no code implementations • 21 Jul 2022 • Yudong Han, Jianhua Yin, Jianlong Wu, Yinwei Wei, Liqiang Nie

Visual Question Answering (VQA) is fundamentally compositional in nature, and many questions are simply answered by decomposing them into modular sub-problems.

Question Answering Visual Question Answering

Multimodal Dialog Systems with Dual Knowledge-enhanced Generative Pretrained Language Model

no code implementations • 16 Jul 2022 • Xiaolin Chen, Xuemeng Song, Liqiang Jing, Shuo Li, Linmei Hu, Liqiang Nie

To address these limitations, we propose a novel dual knowledge-enhanced generative pretrained language model for multimodal task-oriented dialog systems (DKMD), consisting of three key components: dual knowledge selection, dual knowledge-enhanced context learning, and knowledge-enhanced response generation.

Language Modelling Response Generation

Lipschitz Continuity Retained Binary Neural Network

1 code implementation • 13 Jul 2022 • Yuzhang Shang, Dan Xu, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Relying on the premise that the performance of a binary neural network can be largely restored with eliminated quantization error between full-precision weight vectors and their corresponding binary vectors, existing works of network binarization frequently adopt the idea of model robustness to reach the aforementioned objective.

Binarization Quantization

Network Binarization via Contrastive Learning

1 code implementation • 6 Jul 2022 • Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan

Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.

Binarization Contrastive Learning +2

A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA

1 code implementation • 30 Jun 2022 • Yangyang Guo, Liqiang Nie, Yongkang Wong, Yibing Liu, Zhiyong Cheng, Mohan Kankanhalli

On the other hand, pertaining to the implicit knowledge, the multi-modal implicit knowledge for knowledge-based VQA still remains largely unexplored.

Question Answering Retrieval +1

User-controllable Recommendation Against Filter Bubbles

1 code implementation • 29 Apr 2022 • Wenjie Wang, Fuli Feng, Liqiang Nie, Tat-Seng Chua

both accuracy and diversity.

Blocking counterfactual +3

Image-text Retrieval: A Survey on Recent Research and Development

no code implementations • 28 Mar 2022 • Min Cao, Shiping Li, Juntao Li, Liqiang Nie, Min Zhang

On top of this, the efficiency-focused study on the ITR system is introduced as the third perspective.

Retrieval Text Retrieval

Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation

1 code implementation • CVPR 2022 • Xingning Dong, Tian Gan, Xuemeng Song, Jianlong Wu, Yuan Cheng, Liqiang Nie

Scene Graph Generation, which generally follows a regular encoder-decoder pipeline, aims to first encode the visual contents within the given image and then parse them into a compact summary graph.

Ranked #1 on Unbiased Scene Graph Generation on Visual Genome (mR@20 metric)

Graph Generation Unbiased Scene Graph Generation

Disentangled Multimodal Representation Learning for Recommendation

1 code implementation • 10 Mar 2022 • Fan Liu, Huilin Chen, Zhiyong Cheng, AnAn Liu, Liqiang Nie, Mohan Kankanhalli

However, existing methods ignore the fact that different modalities contribute differently towards a user's preference on various factors of an item.

Recommendation Systems Representation Learning

Voice-Face Homogeneity Tells Deepfake

no code implementations • 4 Mar 2022 • Harry Cheng, Yangyang Guo, Tianyi Wang, Qi Li, Xiaojun Chang, Liqiang Nie

To this end, a voice-face matching method is devised to measure the matching degree of these two.

MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning

1 code implementation • Findings (ACL) 2022 • Fangkai Jiao, Yangyang Guo, Xuemeng Song, Liqiang Nie

Logical reasoning is of vital importance to natural language understanding.

Ranked #3 on Reading Comprehension on ReClor

Contrastive Learning counterfactual +4

On Modality Bias Recognition and Reduction

1 code implementation • 25 Feb 2022 • Yangyang Guo, Liqiang Nie, Harry Cheng, Zhiyong Cheng, Mohan Kankanhalli, Alberto del Bimbo

From the results on four datasets regarding the above three tasks, our method yields remarkable performance improvements compared with the baselines, demonstrating its superiority on reducing the modality bias problem.

Action Recognition Multi-modal Classification +3

Joint Answering and Explanation for Visual Commonsense Reasoning

1 code implementation • 25 Feb 2022 • Zhenyang Li, Yangyang Guo, Kejie Wang, Yinwei Wei, Liqiang Nie, Mohan Kankanhalli

Given that our framework is model-agnostic, we apply it to the existing popular baselines and validate its effectiveness on the benchmark dataset.

Knowledge Distillation Question Answering +2

Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning

no code implementations • 30 Jan 2022 • Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Extensive experiments on CIFAR-10 and CIFAR-100 demonstrate the superiority of our novel Fourier analysis based MBP compared to other traditional MBP algorithms.

Knowledge Distillation Network Pruning

Learning Robust Recommender from Noisy Implicit Feedback

1 code implementation • 2 Dec 2021 • Wenjie Wang, Fuli Feng, Xiangnan He, Liqiang Nie, Tat-Seng Chua

Inspired by this observation, we propose a new training strategy named Adaptive Denoising Training (ADT), which adaptively prunes the noisy interactions by two paradigms (i.e., Truncated Loss and Reweighted Loss).

Denoising Recommendation Systems

Hierarchical Deep Residual Reasoning for Temporal Moment Localization

1 code implementation • 31 Oct 2021 • Ziyang Ma, Xianjing Han, Xuemeng Song, Yiran Cui, Liqiang Nie

Temporal Moment Localization (TML) in untrimmed videos is a challenging task in the field of multimedia, which aims at localizing the start and end points of the activity in the video, described by a sentence query.

Language-Based Temporal Localization Sentence

Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

1 code implementation • 12 Oct 2021 • Zongmeng Zhang, Xianjing Han, Xuemeng Song, Yan Yan, Liqiang Nie

Towards this end, in this work, we propose a Multi-modal Interaction Graph Convolutional Network (MIGCN), which jointly explores the complex intra-modal relations and inter-modal interactions residing in the video and sentence query to facilitate the understanding and semantic correspondence capture of the video and sentence query.

Semantic correspondence Semantic Similarity +2

Contrastive Mutual Information Maximization for Binary Neural Networks

no code implementations • 29 Sep 2021 • Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan

Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.

Binarization Contrastive Learning +2

Lipschitz Continuity Guided Knowledge Distillation

no code implementations • ICCV 2021 • Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Knowledge distillation has become one of the most important model compression techniques by distilling knowledge from larger teacher networks to smaller student ones.

Knowledge Distillation Model Compression +2

When Product Search Meets Collaborative Filtering: A Hierarchical Heterogeneous Graph Neural Network Approach

no code implementations • 17 Aug 2021 • Xiangkun Yin, Yangyang Guo, Liqiang Nie, Zhiyong Cheng

In addition, we empirically prove that collaborative filtering and semantic matching are complementary to each other in product search performance enhancement.

Collaborative Filtering Representation Learning +1

Contrastive Learning for Cold-Start Recommendation

1 code implementation • 12 Jul 2021 • Yinwei Wei, Xiang Wang, Qi Li, Liqiang Nie, Yan Li, Xuanping Li, Tat-Seng Chua

It aims to maximize the mutual dependencies between item content and collaborative signals.

Contrastive Learning Recommendation Systems +1

Dynamic Modality Interaction Modeling for Image-Text Retrieval

1 code implementation • ACM Special Interest Group on Information Retrieval 2021 • Leigang Qu, Meng Liu, Jianlong Wu, Zan Gao, Liqiang Nie

To address these issues, we develop a novel modality interaction modeling network based upon the routing mechanism, which is the first unified and dynamic multimodal interaction framework towards image-text retrieval.

Cross-Modal Retrieval Information Retrieval +2

Review Polarity-wise Recommender

1 code implementation • 8 Jun 2021 • Han Liu, Yangyang Guo, Jianhua Yin, Zan Gao, Liqiang Nie

To be specific, in this model, positive and negative reviews are separately gathered and utilized to model the user-preferred and user-rejected aspects, respectively.

Recommendation Systems

REPT: Bridging Language Models and Machine Reading Comprehension via Retrieval-Based Pre-training

1 code implementation • Findings (ACL) 2021 • Fangkai Jiao, Yangyang Guo, Yilin Niu, Feng Ji, Feng-Lin Li, Liqiang Nie

Pre-trained Language Models (PLMs) have achieved great success on Machine Reading Comprehension (MRC) over the past few years.

Machine Reading Comprehension Retrieval

AdaVQA: Overcoming Language Priors with Adapted Margin Cosine Loss

1 code implementation • 5 May 2021 • Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Feng Ji, Ji Zhang, Alberto del Bimbo

Experimental results demonstrate that our adapted margin cosine loss can greatly enhance the baseline models with an absolute performance gain of 15% on average, strongly verifying the potential of tackling the language prior problem in VQA from the angle of the answer feature space learning.

Question Answering Visual Question Answering

A Graph-guided Multi-round Retrieval Method for Conversational Open-domain Question Answering

no code implementations • 17 Apr 2021 • Yongqi Li, Wenjie Li, Liqiang Nie

Moreover, in order to collect more complementary information in the historical context, we also propose to incorporate the multi-round relevance feedback technique to explore the impact of the retrieval context on current question understanding.

Conversational Question Answering Open-Domain Question Answering +1

Graph Contrastive Clustering

1 code implementation • ICCV 2021 • Huasong Zhong, Jianlong Wu, Chong Chen, Jianqiang Huang, Minghua Deng, Liqiang Nie, Zhouchen Lin, Xian-Sheng Hua

On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.

Clustering Contrastive Learning

Feature-level Attentive ICF for Recommendation

1 code implementation • 22 Feb 2021 • Zhiyong Cheng, Fan Liu, Shenghan Mei, Yangyang Guo, Lei Zhu, Liqiang Nie

To demonstrate the effectiveness of our method, we design a light attention neural network to integrate both item-level and feature-level attention for neural ICF models.

Collaborative Filtering Recommendation Systems

Interest-aware Message-Passing GCN for Recommendation

1 code implementation • 19 Feb 2021 • Fan Liu, Zhiyong Cheng, Lei Zhu, Zan Gao, Liqiang Nie

To form the subgraphs, we design an unsupervised subgraph generation module, which can effectively identify users with common interests by exploiting both user feature and graph structure.

Answer Questions with Right Image Regions: A Visual Attention Regularization Approach

1 code implementation • 3 Feb 2021 • Yibing Liu, Yangyang Guo, Jianhua Yin, Xuemeng Song, Weifeng Liu, Liqiang Nie

However, recent studies have pointed out that the highlighted image regions from the visual attention are often irrelevant to the given question and answer, leading to model confusion for correct visual reasoning.

Question Answering Visual Grounding +2

Incremental Knowledge Based Question Answering

no code implementations • 18 Jan 2021 • Yongqi Li, Wenjie Li, Liqiang Nie

In the past years, Knowledge-Based Question Answering (KBQA), which aims to answer natural language questions using facts in a knowledge base, has been well developed.

Incremental Learning Knowledge Distillation +1

Market2Dish: Health-aware Food Recommendation

1 code implementation • 11 Dec 2020 • Wenjie Wang, Ling-Yu Duan, Hao Jiang, Peiguang Jing, Xuemeng Song, Liqiang Nie

With the rising incidence of some diseases, such as obesity and diabetes, a healthy diet is arousing increasing attention.

Food recommendation Nutrition +1

Loss re-scaling VQA: Revisiting the Language Prior Problem from a Class-imbalance View

1 code implementation • 30 Oct 2020 • Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Qi Tian, Min Zhang

Concretely, we design a novel interpretation scheme whereby the loss of mis-predicted frequent and sparse answers of the same question type is distinctly exhibited during the late training phase.

Face Recognition Image Classification +2

Enhancing Factorization Machines with Generalized Metric Learning

1 code implementation • 20 Jun 2020 • Yangyang Guo, Zhiyong Cheng, Jiazheng Jing, Yanpeng Lin, Liqiang Nie, Meng Wang

Traditional FMs adopt the inner product to model the second-order interactions between different attributes, which are represented via feature vectors.

Attribute Metric Learning +1

Denoising Implicit Feedback for Recommendation

1 code implementation • 7 Jun 2020 • Wenjie Wang, Fuli Feng, Xiangnan He, Liqiang Nie, Tat-Seng Chua

In this work, we explore the central theme of denoising implicit feedback for recommender training.

Denoising Recommendation Systems

A^2-GCN: An Attribute-aware Attentive GCN Model for Recommendation

no code implementations • 20 Mar 2020 • Fan Liu, Zhiyong Cheng, Lei Zhu, Chenghao Liu, Liqiang Nie

Considering the fact that for different users, the attributes of an item have different influence on their preference for this item, we design a novel attention mechanism to filter the message passed from an item to a target user by considering the attribute information.

Attribute Recommendation Systems

Improving Distantly-Supervised Relation Extraction with Joint Label Embedding

no code implementations • IJCNLP 2019 • Linmei Hu, Luhao Zhang, Chuan Shi, Liqiang Nie, Weili Guan, Cheng Yang

Distantly-supervised relation extraction has proven to be effective to find relational facts from texts.

Knowledge Graphs Relation +2

MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video

1 code implementation • ACM International Conference on Multimedia 2019 • Yinwei Wei, Xiang Wang, Liqiang Nie, Xiangnan He, Richang Hong, Tat-Seng Chua

Existing works on multimedia recommendation largely exploit multi-modal contents to enrich item representations, while less effort is made to leverage information interchange between users and items to enhance user representations and further capture user's fine-grained preferences on different modalities.

Ranked #1 on Multi-Media Recommendation on MovieLens 10M

Microvideo Recommendation Micro-video recommendations +4

Personalized Hashtag Recommendation for Micro-videos

1 code implementation • 27 Aug 2019 • Yinwei Wei, Zhiyong Cheng, Xuzheng Yu, Zhou Zhao, Lei Zhu, Liqiang Nie

The hashtags, that a user provides to a post (e.g., a micro-video), are the ones which in her mind can well describe the post content where she is interested in.

User Diverse Preference Modeling by Multimodal Attentive Metric Learning

1 code implementation • 21 Aug 2019 • Fan Liu, Zhiyong Cheng, Changchang Sun, Yinglong Wang, Liqiang Nie, Mohan Kankanhalli

To tackle this problem, in this paper, we propose a novel Multimodal Attentive Metric Learning (MAML) method to model user diverse preferences for various items.

Metric Learning Recommendation Systems

Quantifying and Alleviating the Language Prior Problem in Visual Question Answering

1 code implementation • 13 May 2019 • Yangyang Guo, Zhiyong Cheng, Liqiang Nie, Yibing Liu, Yinglong Wang, Mohan Kankanhalli

Benefiting from the advancement of computer vision, natural language processing and information retrieval techniques, visual question answering (VQA), which aims to answer questions about an image or a video, has received lots of attentions over the past few years.

Information Retrieval Question Answering +2

Explicit Interaction Model towards Text Classification

1 code implementation • 23 Nov 2018 • Cunxiao Du, Zhaozheng Chin, Fuli Feng, Lei Zhu, Tian Gan, Liqiang Nie

To address this problem, we introduce the interaction mechanism to incorporate word-level matching signals into the text classification task.

Ranked #4 on Text Classification on Yahoo! Answers

General Classification Multi Class Text Classification +3

Learning to Ask Questions in Open-domain Conversational Systems with Typed Decoders

1 code implementation • ACL 2018 • Yansen Wang, Chen-Yi Liu, Minlie Huang, Liqiang Nie

Asking good questions in large-scale, open-domain conversational systems is quite significant yet rather untouched.

Question Generation Question-Generation +1

Discrete Factorization Machines for Fast Feature-based Recommendation

1 code implementation • 6 May 2018 • Han Liu, Xiangnan He, Fuli Feng, Liqiang Nie, Rui Liu, Hanwang Zhang

In this paper, we develop a generic feature-based recommendation model, called Discrete Factorization Machine (DFM), for fast and accurate recommendation.

Binarization Quantization

Neural Compatibility Modeling with Attentive Knowledge Distillation

no code implementations • 17 Apr 2018 • Xuemeng Song, Fuli Feng, Xianjing Han, Xin Yang, Wei Liu, Liqiang Nie

Nevertheless, existing studies overlook the rich valuable knowledge (rules) accumulated in fashion domain, especially the rules regarding clothing matching.

Image Classification Knowledge Distillation +2

Neural Collaborative Filtering

43 code implementations • WWW 2017 • Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, Tat-Seng Chua

When it comes to model the key factor in collaborative filtering -- the interaction between user and item features, they still resorted to matrix factorization and applied an inner product on the latent features of users and items.

Collaborative Filtering Recommendation Systems

Laplacian-Steered Neural Style Transfer

3 code implementations • 5 Jul 2017 • Shaohua Li, Xinxing Xu, Liqiang Nie, Tat-Seng Chua

However in the traditional optimization objective, low-level features of the content image are absent, and the low-level features of the style image dominate the low-level detail structures of the new image.

Image Generation Style Transfer

Item Silk Road: Recommending Items from Information Domains to Social Users

no code implementations • 10 Jun 2017 • Xiang Wang, Xiangnan He, Liqiang Nie, Tat-Seng Chua

In this work, we address the problem of cross-domain social recommendation, i.e., recommending relevant items of information domains to potential users of social networks.

Ranked #2 on Recommendation Systems on WeChat

Collaborative Ranking Recommendation Systems

Supervised Deep Hashing for Hierarchical Labeled Data

no code implementations • 7 Apr 2017 • Dan Wang, He-Yan Huang, Chi Lu, Bo-Si Feng, Liqiang Nie, Guihua Wen, Xian-Ling Mao

Specifically, we define a novel similarity formula for hierarchical labeled data by weighting each layer, and design a deep convolutional neural network to obtain a hash code for each data point.

Deep Hashing Image Retrieval

Simple to Complex Cross-modal Learning to Rank

no code implementations • 4 Feb 2017 • Minnan Luo, Xiaojun Chang, Zhihui Li, Liqiang Nie, Alexander G. Hauptmann, Qinghua Zheng

The heterogeneity-gap between different modalities brings a significant challenge to multimedia information retrieval.

Cross-Modal Retrieval Information Retrieval +3

SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning

2 code implementations • CVPR 2017 • Long Chen, Hanwang Zhang, Jun Xiao, Liqiang Nie, Jian Shao, Wei Liu, Tat-Seng Chua

Existing visual attention models are generally spatial, i.e., the attention is modeled as spatial probabilities that re-weight the last conv-layer feature map of a CNN encoding an input image.

Image Captioning Sentence
