no code implementations • 18 Apr 2024 • Suyuan Huang, Haoxin Zhang, Yan Gao, Yao Hu, Zengchang Qin
Multimodal Large Language Models (MLLMs) have demonstrated profound capabilities in understanding multimodal information, ranging from Image LLMs to the more complex Video LLMs.
no code implementations • 18 Mar 2024 • Yuhe Liu, Mengxue Kang, Zengchang Qin, Xiangxiang Chu
Experiments show that our model achieves better logical performance, and that the extracted logical knowledge can be effectively applied to other scenarios.
no code implementations • 6 Feb 2024 • Zijie Zhong, Yunhui Zhang, Ziyi Chang, Zengchang Qin
CADReN is also shown to match the performance of previous models on the single-graph NIE task.
no code implementations • ICCV 2023 • Yuhe Liu, Chuanjian Liu, Kai Han, Quan Tang, Zengchang Qin
Following this observation, we propose ECENet, a new segmentation paradigm in which class embeddings are obtained and enhanced explicitly through interaction with multi-stage image features.
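The abstract gives no implementation details, but one common way to let class embeddings interact with image features is cross-attention with a residual update. The sketch below is an illustrative NumPy toy, not ECENet's actual architecture; the function name and the single-stage, single-head form are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def enhance_class_embeddings(cls_emb, feats):
    # Hypothetical single-stage enhancement step: each of the C class
    # embeddings cross-attends over N flattened image features (both of
    # dimension d) and is updated residually.
    d = cls_emb.shape[1]
    attn = softmax(cls_emb @ feats.T / np.sqrt(d), axis=-1)  # (C, N) weights
    return cls_emb + attn @ feats                            # residual update

C, N, d = 3, 16, 8
cls_emb = rng.normal(size=(C, d))
feats = rng.normal(size=(N, d))
enhanced = enhance_class_embeddings(cls_emb, feats)
```

In a multi-stage version this step would be repeated with feature maps from several backbone stages, refining the class embeddings progressively.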
1 code implementation • 17 Jun 2022 • Zheng He, Zeke Xie, Quanzhi Zhu, Zengchang Qin
People usually believe that network pruning not only reduces the computational cost of deep networks, but also prevents overfitting by decreasing model capacity.
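The belief being examined refers to standard network pruning; its simplest form is global magnitude pruning, sketched below in NumPy. This is the generic textbook technique, not this paper's contribution, and the function name is made up for illustration.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    # Zero out (approximately) the smallest-magnitude fraction `sparsity`
    # of the entries; ties at the threshold may prune slightly more.
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold            # keep only larger weights
    return weights * mask

w = np.array([[0.05, -0.8], [0.3, -0.02]])
pruned = magnitude_prune(w, 0.5)  # half of the entries are zeroed
```

Pruned weights reduce multiply-accumulate work and shrink the hypothesis space, which is why pruning is often assumed to act as a regularizer.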
no code implementations • 10 Apr 2022 • Shunyu Zhang, Xiaoze Jiang, Zequn Yang, Tao Wan, Zengchang Qin
In our model, the external knowledge is represented with sentence-level facts and graph-level facts to properly suit the composite scenario of dialog history and image.
no code implementations • 29 Sep 2021 • Zheng He, Quanzhi Zhu, Zengchang Qin
Network pruning is a widely-used technique to reduce the computational cost of over-parameterized neural networks.
no code implementations • 11 Aug 2020 • Xiaoze Jiang, Siyi Du, Zengchang Qin, Yajing Sun, Jing Yu
Visual dialogue is a challenging task that needs to extract implicit information from both visual (image) and textual (dialogue history) contexts.
4 code implementations • 7 Jul 2020 • Xiaoze Jiang, Jing Yu, Yajing Sun, Zengchang Qin, Zihao Zhu, Yue Hu, Qi Wu
The ability to generate detailed and non-repetitive responses is crucial for the agent to achieve human-like conversation.
no code implementations • 26 Nov 2019 • Ying Huang, Jiankai Zhuang, Zengchang Qin
In multi-person pose estimation, discriminating left/right joint types remains a hard problem because of their similar appearance.
no code implementations • 19 Nov 2019 • Ying Huang, Bin Sun, Haipeng Kan, Jiankai Zhuang, Zengchang Qin
Human pose estimation has made significant advancement in recent years.
1 code implementation • 17 Nov 2019 • Xiaoze Jiang, Jing Yu, Zengchang Qin, Yingying Zhuang, Xingxing Zhang, Yue Hu, Qi Wu
More importantly, we can tell which modality (visual or semantic) has more contribution in answering the current question by visualizing the gate values.
Ranked #6 on Visual Dialog on VisDial v0.9 val
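Visualizing gate values to attribute an answer to the visual or semantic modality implies a learned convex combination of the two pathways. The NumPy sketch below shows the generic gating pattern under that assumption; the function names and the single scalar gate are illustrative, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(visual, semantic, w, b):
    # Scalar gate in (0, 1): the weight given to the visual pathway.
    # Inspecting `g` per question tells which modality dominated.
    g = sigmoid(np.concatenate([visual, semantic]) @ w + b)
    return g * visual + (1.0 - g) * semantic, g

d = 4
visual = rng.normal(size=d)    # stand-in for an image feature vector
semantic = rng.normal(size=d)  # stand-in for a dialog/text feature vector
w = rng.normal(size=2 * d)
fused, gate = gated_fusion(visual, semantic, w, 0.0)
```

A gate near 1 indicates the fused representation is mostly visual, near 0 mostly semantic, which is what makes the gate values directly interpretable.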
no code implementations • 23 Dec 2018 • Zhuoqian Yang, Zengchang Qin, Jing Yu, Yue Hu
On top of the constructed graph, we propose a Scene Graph Convolutional Network (SceneGCN) to jointly reason over object properties and relational semantics for the correct answer.
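The core operation of any graph convolutional network is message passing over an adjacency matrix. The sketch below is the standard Kipf–Welling-style GCN layer in NumPy as a reference point, not SceneGCN itself, whose relation-aware propagation rule is more elaborate.

```python
import numpy as np

rng = np.random.default_rng(0)

def gcn_layer(A, H, W):
    # One graph-convolution layer: H' = ReLU(D^{-1/2} (A+I) D^{-1/2} H W),
    # i.e. each node averages its (self-looped, degree-normalized)
    # neighbors' features and applies a linear map plus ReLU.
    A_hat = A + np.eye(A.shape[0])       # add self-loops
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(deg ** -0.5)    # symmetric normalization
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

# Toy scene graph: 4 objects on a path, 5-d features mapped to 3-d.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = rng.normal(size=(4, 5))
W = rng.normal(size=(5, 3))
H2 = gcn_layer(A, H, W)
```

Stacking such layers lets information about object properties propagate along the scene graph's relations before answer prediction.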
no code implementations • 1 Nov 2018 • Shuangting Liu, Jia-Qi Zhang, Yuxin Chen, Yifan Liu, Zengchang Qin, Tao Wan
Semantic segmentation is one of the fundamental topics in computer vision: it aims to assign a semantic label to every pixel of an image.
no code implementations • 1 Nov 2018 • Daouda Sow, Zengchang Qin, Mouhamed Niasse, Tao Wan
The recent advances of deep learning in both computer vision (CV) and natural language processing (NLP) provide us with a new way of understanding semantics, by which we can tackle more challenging tasks such as automatic description generation from natural images.
1 code implementation • 31 Oct 2018 • Jing Yu, Chenghao Yang, Zengchang Qin, Zhuoqian Yang, Yue Hu, Yanbing Liu
A joint neural model is proposed to learn feature representations individually in each modality.
no code implementations • 3 Feb 2018 • Jing Yu, Yuhang Lu, Zengchang Qin, Yanbing Liu, Jianlong Tan, Li Guo, Weifeng Zhang
A dual-path neural network model is proposed for coupled feature learning in cross-modal information retrieval.
1 code implementation • 1 Dec 2017 • Heng Wang, Zengchang Qin, Tao Wan
We propose the VGAN model, in which the generative model is composed of a recurrent neural network and a VAE.
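The VAE component of such a generator relies on the reparameterization trick to sample latent codes while keeping the model differentiable. The minimal NumPy sketch below shows just that trick; it is a generic VAE building block under stated assumptions, not VGAN's full RNN+VAE generator.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, logvar):
    # Sample z ~ N(mu, sigma^2) as z = mu + sigma * eps with eps ~ N(0, I),
    # so gradients can flow through mu and logvar (here purely numeric).
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

mu = np.zeros(3)
logvar = np.zeros(3)  # log-variance 0, i.e. unit sigma
z = reparameterize(mu, logvar)
```

In an RNN+VAE generator, an encoder RNN would produce `mu` and `logvar` per step, and the sampled `z` would condition the decoder.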
no code implementations • 2 Nov 2017 • Xinyue Zhu, Yifan Liu, Zengchang Qin, Jiahong Li
In this paper, we propose a data augmentation method using generative adversarial networks (GANs).
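The overall recipe for GAN-based augmentation is: sample noise, run it through a trained generator, and append the synthetic samples to the real training set. The sketch below assumes a toy `tanh` generator as a stand-in for a trained network; the function names and shapes are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z, W):
    # Stand-in for a trained GAN generator mapping noise to sample space;
    # a real generator would be a trained neural network.
    return np.tanh(z @ W)

def gan_augment(real, W, n_fake, noise_dim):
    # Append n_fake generated samples to the real training set.
    z = rng.normal(size=(n_fake, noise_dim))
    return np.concatenate([real, generator(z, W)], axis=0)

real = rng.normal(size=(10, 6))       # 10 real samples, 6 features each
W = rng.normal(size=(4, 6))           # toy generator weights
augmented = gan_augment(real, W, n_fake=5, noise_dim=4)
```

The downstream classifier is then trained on `augmented` instead of `real`, which is the essence of augmentation when labeled data is scarce.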
no code implementations • 9 May 2017 • Liang Li, Pengyu Li, Yifan Liu, Tao Wan, Zengchang Qin
Under our learning policy, the Seq2Seq model can gradually learn mappings in the presence of noise.
no code implementations • 8 May 2017 • Qiangeng Xu, Zengchang Qin, Tao Wan
In this paper, we explore a generative model for the task of generating unseen images with desired features.
4 code implementations • 4 May 2017 • Yifan Liu, Zengchang Qin, Zhenbo Luo, Hua Wang
Learning to generate colorful cartoon images from black-and-white sketches is not only an interesting research problem, but also a potential application in digital entertainment.