Search Results for author: Shutao Li

Found 26 papers, 9 papers with code

VPAI_Lab at MedVidQA 2022: A Two-Stage Cross-modal Fusion Method for Medical Instructional Video Classification

1 code implementation • BioNLP (ACL) 2022 • Bin Li, Yixuan Weng, Fei Xia, Bin Sun, Shutao Li

Given an input video, the MedVidCL task aims to correctly classify it into one of the following three categories: Medical Instructional, Medical Non-instructional, and Non-medical.

Video Classification

Continuing Pre-trained Model with Multiple Training Strategies for Emotional Classification

no code implementations • WASSA (ACL) 2022 • Bin Li, Yixuan Weng, Qiya Song, Bin Sun, Shutao Li

This paper describes the contribution of the LingJing team’s method to the Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA) 2022 shared task on Emotion Classification.

Attribute Classification +4

LingJing at SemEval-2022 Task 1: Multi-task Self-supervised Pre-training for Multilingual Reverse Dictionary

2 code implementations • SemEval (NAACL) 2022 • Bin Li, Yixuan Weng, Fei Xia, Shizhu He, Bin Sun, Shutao Li

This paper introduces the approach of Team LingJing’s experiments on SemEval-2022 Task 1 Comparing Dictionaries and Word Embeddings (CODWOE).

Reverse Dictionary Word Embeddings

LingJing at SemEval-2022 Task 3: Applying DeBERTa to Lexical-level Presupposed Relation Taxonomy with Knowledge Transfer

1 code implementation • SemEval (NAACL) 2022 • Fei Xia, Bin Li, Yixuan Weng, Shizhu He, Bin Sun, Shutao Li, Kang Liu, Jun Zhao

For the classification sub-task, we fine-tune the pre-trained DeBERTa-v3 model on datasets in different languages.

Binary Classification Classification +2

Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation

1 code implementation • 20 Mar 2024 • Linshan Wu, Zhun Zhong, Jiayi Ma, Yunchao Wei, Hao Chen, Leyuan Fang, Shutao Li

Based on the label distributions, we leverage the GMM to generate high-quality pseudo labels for more reliable supervision.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering

no code implementations • 4 Feb 2024 • Ziyu Ma, Shutao Li, Bin Sun, Jianfei Cai, Zuxiang Long, Fuyan Ma

Therefore, we propose GeReA, a generate-reason framework that prompts an MLLM such as InstructBLIP with question-relevant vision and language information to generate knowledge-relevant descriptions and then reasons over those descriptions for knowledge-based VQA.

Language Modelling Large Language Model +3

Hyperspectral Image Fusion via Logarithmic Low-rank Tensor Ring Decomposition

no code implementations • 16 Oct 2023 • Jun Zhang, Lipeng Zhu, Chao Wang, Shutao Li

On the other hand, tensor nuclear norm (TNN)-based approaches have recently been shown to be more efficient at preserving high-dimensional low-rank structures in tensor recovery.

valid

VPUFormer: Visual Prompt Unified Transformer for Interactive Image Segmentation

1 code implementation • 11 Jun 2023 • Xu Zhang, Kailun Yang, Jiacheng Lin, Jin Yuan, Zhiyong Li, Shutao Li

Specifically, we design a Prompt-unified Encoder (PuE) that uses Gaussian mapping to generate a unified one-dimensional vector for click, box, and scribble prompts, which captures users' intentions well and provides a denser representation of user prompts.

Image Segmentation Segmentation +1

AdaptiveClick: Clicks-aware Transformer with Adaptive Focal Loss for Interactive Image Segmentation

1 code implementation • 7 May 2023 • Jiacheng Lin, Jiajun Chen, Kailun Yang, Alina Roitberg, Siyu Li, Zhiyong Li, Shutao Li

Interactive Image Segmentation (IIS) has emerged as a promising technique for decreasing annotation time.

Decoder Image Segmentation +2

LOGO-Former: Local-Global Spatio-Temporal Transformer for Dynamic Facial Expression Recognition

no code implementations • 5 May 2023 • Fuyan Ma, Bin Sun, Shutao Li

Previous methods for dynamic facial expression recognition (DFER) in the wild are mainly based on Convolutional Neural Networks (CNNs), whose local operations ignore the long-range dependencies in videos.

Dynamic Facial Expression Recognition Facial Expression Recognition

Learning to Locate Visual Answer in Video Corpus Using Question

1 code implementation • 11 Oct 2022 • Bin Li, Yixuan Weng, Bin Sun, Shutao Li

We introduce a new task, named video corpus visual answer localization (VCVAL), which aims to locate the visual answer in a large collection of untrimmed instructional videos using a natural language question.

Contrastive Learning Language Modelling +2

Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation

no code implementations • 5 Jul 2022 • Bin Li, Yixuan Weng, Ziyu Ma, Bin Sun, Shutao Li

To fully leverage the visual information for both scene understanding and dialogue generation, we propose the scene-aware prompt for the MDUG task.

Dialogue Generation Dialogue Understanding +2

Spatio-Temporal Transformer for Dynamic Facial Expression Recognition in the Wild

no code implementations • 10 May 2022 • Fuyan Ma, Bin Sun, Shutao Li

Previous methods for dynamic facial expression recognition in the wild are mainly based on Convolutional Neural Networks (CNNs), whose local operations ignore the long-range dependencies in videos.

Ranked #5 on Dynamic Facial Expression Recognition on FERV39k

Dynamic Facial Expression Recognition Facial Expression Recognition +1

LingYi: Medical Conversational Question Answering System based on Multi-modal Knowledge Graphs

1 code implementation • 20 Apr 2022 • Fei Xia, Bin Li, Yixuan Weng, Shizhu He, Kang Liu, Bin Sun, Shutao Li, Jun Zhao

The medical conversational system can relieve the burden of doctors and improve the efficiency of healthcare, especially during the pandemic.

Conversational Question Answering Dialogue Generation +3

Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video

no code implementations • 13 Mar 2022 • Bin Li, Yixuan Weng, Bin Sun, Shutao Li

However, due to the weak correlations and large gaps between the semantic features of the textual question and the visual answer, existing methods that adopt a visual span predictor perform poorly on the TAGV task.

Language Modelling Question Answering +2

PSG: Prompt-based Sequence Generation for Acronym Extraction

no code implementations • 29 Nov 2021 • Bin Li, Fei Xia, Yixuan Weng, Xiusheng Huang, Bin Sun, Shutao Li

In this paper, we propose a Prompt-based Sequence Generation (PSG) method for the acronym extraction task.

document understanding Language Modelling +1

Hybrid Mutimodal Fusion for Dimensional Emotion Recognition

no code implementations • 16 Oct 2021 • Ziyu Ma, Fuyan Ma, Bin Sun, Shutao Li

For the MuSe-Stress sub-challenge, we highlight our solutions in three aspects: 1) the audio-visual features and the bio-signal features are used for emotional state recognition.

Emotion Recognition

More but Correct: Generating Diversified and Entity-revised Medical Response

no code implementations • 3 Aug 2021 • Bin Li, Encheng Chen, Hongru Liu, Yixuan Weng, Bin Sun, Shutao Li, Yongping Bai, Meiling Hu

Medical Dialogue Generation (MDG) is intended to build a medical dialogue system for intelligent consultation, which can communicate with patients in real-time, thereby improving the efficiency of clinical diagnosis with broad application prospects.

Dialogue Generation

Facial Expression Recognition with Visual Transformers and Attentional Selective Fusion

no code implementations • 31 Mar 2021 • Fuyan Ma, Bin Sun, Shutao Li

Facial Expression Recognition (FER) in the wild is extremely challenging due to occlusions, variant head poses, face deformation and motion blur under unconstrained conditions.

Facial Expression Recognition Facial Expression Recognition (FER)

Fusion of Dual Spatial Information for Hyperspectral Image Classification

1 code implementation • 23 Oct 2020 • Puhong Duan, Pedram Ghamisi, Xudong Kang, Behnood Rasti, Shutao Li, Richard Gloaguen

In the spatial optimization stage, a pixel-level classifier is used to obtain the class probability followed by an extended random walker-based spatial optimization technique.

Classification General Classification +1

Recent Advances and New Guidelines on Hyperspectral and Multispectral Image Fusion

no code implementations • 8 Aug 2020 • Renwei Dian, Shutao Li, Bin Sun, Anjing Guo

Hyperspectral images (HSIs) with high spectral resolution often suffer from low spatial resolution owing to the limitations of imaging sensors.

Naive Gabor Networks for Hyperspectral Image Classification

no code implementations • 9 Dec 2019 • Chenying Liu, Jun Li, Lin He, Antonio J. Plaza, Shutao Li, Bo Li

Specifically, we develop an innovative phase-induced Gabor kernel, carefully designed to perform Gabor feature learning via a linear combination of local low-frequency and high-frequency components of the data, controlled by the kernel phase.

Classification General Classification +1

Deep Learning for Hyperspectral Image Classification: An Overview

no code implementations • 26 Oct 2019 • Shutao Li, Weiwei Song, Leyuan Fang, Yushi Chen, Pedram Ghamisi, Jón Atli Benediktsson

Specifically, we first summarize the main challenges of HSI classification which cannot be effectively overcome by traditional machine learning methods, and also introduce the advantages of deep learning to handle these problems.

BIG-bench Machine Learning Classification +2

Deep Hashing Learning for Visual and Semantic Retrieval of Remote Sensing Images

no code implementations • 10 Sep 2019 • Weiwei Song, Shutao Li, Jon Atli Benediktsson

Although retrieval methods have achieved great success, one question remains to be answered: can we obtain accurate semantic labels for the returned similar images to further help analyze and process the imagery?

Deep Hashing Image Retrieval +1

Hyperspectral Image Super-Resolution via Non-Local Sparse Tensor Factorization

no code implementations • CVPR 2017 • Renwei Dian, Leyuan Fang, Shutao Li

In this paper, a novel HSI super-resolution method based on non-local sparse tensor factorization (called NLSTF) is proposed.

Hyperspectral Image Super-Resolution Image Super-Resolution

Feature Extraction of Hyperspectral Images With Image Fusion and Recursive Filtering

no code implementations • IEEE Transactions on Geoscience and Remote Sensing 2013 • Xudong Kang, Shutao Li, Jón Atli Benediktsson

Feature extraction is known to be an effective way to both reduce the computational complexity and increase the accuracy of hyperspectral image classification.

Ranked #5 on Hyperspectral Image Classification on Pavia University

Classification Computational Efficiency +2
