Search Results for author: Dong Zhang

Found 71 papers, 35 papers with code

Joint Multi-modal Aspect-Sentiment Analysis with Auxiliary Cross-modal Relation Detection

1 code implementation • EMNLP 2021 • Xincheng Ju, Dong Zhang, Rong Xiao, Junhui Li, Shoushan Li, Min Zhang, Guodong Zhou

Therefore, in this paper, we are the first to jointly perform multi-modal ATE (MATE) and multi-modal ASC (MASC), and we propose a multi-modal joint learning approach with auxiliary cross-modal relation detection for multi-modal aspect-level sentiment analysis (MALSA).

Relation Sentiment Analysis +1

Multi-modal Multi-label Emotion Detection with Modality and Label Dependence

no code implementations • EMNLP 2020 • Dong Zhang, Xincheng Ju, Junhui Li, Shoushan Li, Qiaoming Zhu, Guodong Zhou

In this paper, we focus on multi-label emotion detection in a multi-modal scenario.

On the Temperature of Machine Learning Systems

no code implementations • 19 Apr 2024 • Dong Zhang

We consider that the initial potential energy of a ML system is described by the model's loss functions, and the energy adheres to the principle of minimum potential energy.

SpeechAlign: Aligning Speech Generation to Human Preferences

2 code implementations • 8 Apr 2024 • Dong Zhang, Zhaowei Li, ShiMin Li, Xin Zhang, Pengyu Wang, Yaqian Zhou, Xipeng Qiu

However, the integration of human feedback to align speech outputs to human preferences is often neglected.

Language Modelling

895 stars

Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers

1 code implementation • 28 Mar 2024 • Pingcheng Dong, Yonghao Tan, Dong Zhang, Tianwei Ni, Xuejiao Liu, Yu Liu, Peng Luo, Luhong Liang, Shih-Yang Liu, Xijie Huang, Huaiyu Zhu, Yun Pan, Fengwei An, Kwang-Ting Cheng

Non-linear functions are prevalent in Transformers and their lightweight variants, incurring substantial and frequently underestimated hardware costs.

Quantization Semantic Segmentation

Unleashing Network Potentials for Semantic Scene Completion

1 code implementation • 12 Mar 2024 • Fengyun Wang, Qianru Sun, Dong Zhang, Jinhui Tang

Semantic scene completion (SSC) aims to predict complete 3D voxel occupancy and semantics from a single-view RGB-D image, and recent SSC methods commonly adopt multi-modal inputs.

Location-guided Head Pose Estimation for Fisheye Image

no code implementations • 28 Feb 2024 • Bing Li, Dong Zhang, Cheng Huang, Yun Xian, Ming Li, Dah-Jye Lee

A camera with a fisheye or ultra-wide lens covers a wide field of view that cannot be modeled by perspective projection.

Head Pose Estimation Multi-Task Learning

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

1 code implementation • 19 Feb 2024 • Jun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhang, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yugang Jiang, Xipeng Qiu

We introduce AnyGPT, an any-to-any multimodal language model that utilizes discrete representations for the unified processing of various modalities, including speech, text, images, and music.

Language Modelling Large Language Model

450 stars

Comment-aided Video-Language Alignment via Contrastive Pre-training for Short-form Video Humor Detection

1 code implementation • 14 Feb 2024 • Yang Liu, Tongfei Shen, Dong Zhang, Qingying Sun, Shoushan Li, Guodong Zhou

The growing importance of multi-modal humor detection within affective computing correlates with the expanding influence of short-form video sharing on social media platforms.

Humor Detection

GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

1 code implementation • 10 Feb 2024 • Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng

Leveraging the rich linguistic knowledge and strong reasoning abilities of LLMs, our new paradigm can integrate the rich information in N-best candidates to generate a higher-quality translation result.

Machine Translation Translation

150 stars

Boundary and Relation Distillation for Semantic Segmentation

no code implementations • 24 Jan 2024 • Dong Zhang, Pingcheng Dong, Xinting Hu, Long Chen, Kwang-Ting Cheng

Concurrently, the relation distillation transfers implicit relations from the teacher model to the student model using pixel-level self-relation as a bridge, ensuring that the student's mask has strong target region connectivity.

Implicit Relations Knowledge Distillation +2

SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation

1 code implementation • 24 Jan 2024 • Dong Zhang, Xin Zhang, Jun Zhan, ShiMin Li, Yaqian Zhou, Xipeng Qiu

It comprises an autoregressive model based on LLM for semantic information modeling and a non-autoregressive model employing flow matching for perceptual information modeling.

Voice Conversion

895 stars

InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance

1 code implementation • 20 Jan 2024 • Pengyu Wang, Dong Zhang, Linyang Li, Chenkun Tan, Xinghao Wang, Ke Ren, Botian Jiang, Xipeng Qiu

With the rapid development of large language models (LLMs), they are not only used as general-purpose AI assistants but are also customized through further fine-tuning to meet the requirements of different applications.

BoNuS: Boundary Mining for Nuclei Segmentation with Partial Point Labels

1 code implementation • 15 Jan 2024 • Yi Lin, Zeyu Wang, Dong Zhang, Kwang-Ting Cheng, Hao Chen

To alleviate this problem, in this paper, we propose a weakly-supervised nuclei segmentation method that only requires partial point labels of nuclei.

Multiple Instance Learning Segmentation

GroundingGPT: Language Enhanced Multi-modal Grounding Model

2 code implementations • 11 Jan 2024 • Zhaowei Li, Qi Xu, Dong Zhang, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Van Tu Vu, Zhida Huang, Tao Wang

Beyond capturing global information like other multi-modal models, our proposed model excels at tasks demanding a detailed understanding of local information within the input.

Language Modelling Large Language Model

201 stars

SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems

1 code implementation • 8 Jan 2024 • Dong Zhang, Zhaowei Li, Pengyu Wang, Xin Zhang, Yaqian Zhou, Xipeng Qiu

In this paper, we propose SpeechAgents, a multi-modal LLM based multi-agent system designed for simulating human communication.

Language Modelling Large Language Model

Towards SAMBA: Segment Anything Model for Brain Tumor Segmentation in Sub-Sharan African Populations

no code implementations • 19 Dec 2023 • Mohannad Barakat, Noha Magdy, Jjuuko George William, Ethel Phiri, Raymond Confidence, Dong Zhang, Udunna C Anazodo

This study was conducted on the Brain Tumor Segmentation (BraTS) Challenge Africa (BraTS-Africa) dataset, which provides a valuable resource for addressing challenges specific to resource-limited settings, particularly the African population, and facilitating the development of effective and more generalizable segmentation algorithms.

Brain Tumor Segmentation Segmentation +1

Bridging the Gap: Generalising State-of-the-Art U-Net Models to Sub-Saharan African Populations

no code implementations • 19 Dec 2023 • Alyssa R. Amod, Alexandra Smith, Pearly Joubert, Confidence Raymond, Dong Zhang, Udunna C. Anazodo, Dodzi Motchon, Tinashe E. M. Mutsvangwa, Sébastien Quetin

We replicated a framework that secured the 2nd position in the 2022 BraTS competition to investigate the impact of dataset composition on model performance and pursued four distinct approaches through training a model with: 1) BraTS-Africa data only (train_SSA, N=60), 2) BraTS-Adult Glioma data only (train_GLI, N=1251), 3) both datasets together (train_ALL, N=1311), and 4) through further training the train_GLI model with BraTS-Africa data (train_ftSSA).

Physics-Informed Neural Network for Discovering Systems with Unmeasurable States with Application to Lithium-Ion Batteries

no code implementations • 27 Nov 2023 • Yuichi Kajiura, Jorge Espin, Dong Zhang

In particular, instead of having loss terms from each differential equation, this method embeds the dynamics into a loss function that quantifies the error between observed and predicted system outputs.

SeqXGPT: Sentence-Level AI-Generated Text Detection

1 code implementation • 13 Oct 2023 • Pengyu Wang, Linyang Li, Ke Ren, Botian Jiang, Dong Zhang, Xipeng Qiu

Therefore, it is important to build strong AI-generated text (AIGT) detectors.

Sentence Text Detection

SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models

3 code implementations • 31 Aug 2023 • Xin Zhang, Dong Zhang, ShiMin Li, Yaqian Zhou, Xipeng Qiu

Therefore, we propose SpeechTokenizer, a unified speech tokenizer for speech large language models.

Language Modelling Quantization

280 stars

Synthetic Instance Segmentation from Semantic Image Segmentation Masks

1 code implementation • 2 Aug 2023 • Yuchen Shen, Dong Zhang, Yuhui Zheng, Zechao Li, Liyong Fu, Qiaolin Ye

SISeg does not require training a semantic and/or instance segmentation model and avoids the need for instance-level image annotations.

Image Segmentation Instance Segmentation +3

Improving Reference-based Distinctive Image Captioning with Contrastive Rewards

no code implementations • 25 Jun 2023 • Yangjun Mao, Jun Xiao, Dong Zhang, Meng Cao, Jian Shao, Yueting Zhuang, Long Chen

A recent DIC method proposes to generate distinctive captions by comparing the target image with a set of semantically similar reference images, i.e., reference-based DIC (Ref-DIC).

Benchmarking Contrastive Learning +1

The Brain Tumor Segmentation (BraTS) Challenge 2023: Glioma Segmentation in Sub-Saharan Africa Patient Population (BraTS-Africa)

no code implementations • 30 May 2023 • Maruf Adewole, Jeffrey D. Rudie, Anu Gbadamosi, Oluyemisi Toyobo, Confidence Raymond, Dong Zhang, Olubukola Omidiji, Rachel Akinola, Mohammad Abba Suwaid, Adaobi Emegoakor, Nancy Ojo, Kenneth Aguh, Chinasa Kalaiwo, Gabriel Babatunde, Afolabi Ogunleye, Yewande Gbadamosi, Kator Iorpagher, Evan Calabrese, Mariam Aboian, Marius Linguraru, Jake Albrecht, Benedikt Wiestler, Florian Kofler, Anastasia Janas, Dominic LaBella, Anahita Fathi Kzerooni, Hongwei Bran Li, Juan Eugenio Iglesias, Keyvan Farahani, James Eddy, Timothy Bergquist, Verena Chung, Russell Takeshi Shinohara, Walter Wiggins, Zachary Reitman, Chunhao Wang, Xinyang Liu, Zhifan Jiang, Ariana Familiar, Koen van Leemput, Christina Bukas, Maire Piraud, Gian-Marco Conte, Elaine Johansson, Zeke Meier, Bjoern H Menze, Ujjwal Baid, Spyridon Bakas, Farouk Dako, Abiodun Fatade, Udunna C Anazodo

Thus, the BraTS-Africa Challenge provides a unique opportunity to include brain MRI glioma cases from SSA in global efforts through the BraTS Challenge to develop and evaluate computer-aided-diagnostic (CAD) methods for the detection and characterization of glioma in resource-limited settings, where CAD tools are more likely to transform healthcare.

Brain Tumor Segmentation Tumor Segmentation

DUB: Discrete Unit Back-translation for Speech Translation

1 code implementation • 19 May 2023 • Dong Zhang, Rong Ye, Tom Ko, Mingxuan Wang, Yaqian Zhou

The key point is to bridge the modality gap between speech and text so that useful MT techniques can be applied to ST.

Machine Translation Speech-to-Text Translation +1

SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities

1 code implementation • 18 May 2023 • Dong Zhang, ShiMin Li, Xin Zhang, Jun Zhan, Pengyu Wang, Yaqian Zhou, Xipeng Qiu

Multi-modal large language models are regarded as a crucial step towards Artificial General Intelligence (AGI) and have garnered significant interest with the emergence of ChatGPT.

Language Modelling Large Language Model +2

895 stars

Should ChatGPT and Bard Share Revenue with Their Data Providers? A New Business Model for the AI Era

no code implementations • 4 May 2023 • Dong Zhang

Sharing revenue with data providers using such a scoring system would encourage more data owners to participate in the revenue-sharing program.

Rethinking Boundary Detection in Deep Learning Models for Medical Image Segmentation

1 code implementation • 1 May 2023 • Yi Lin, Dong Zhang, Xiao Fang, Yufan Chen, Kwang-Ting Cheng, Hao Chen

Medical image segmentation is a fundamental task in the community of medical image analysis.

Boundary Detection Image Segmentation +3

Discrepancy-Guided Reconstruction Learning for Image Forgery Detection

no code implementations • 26 Apr 2023 • Zenan Shi, Haipeng Chen, Long Chen, Dong Zhang

In this paper, we propose a novel image forgery detection paradigm for boosting the model learning capacity on both forgery-sensitive and genuine compact visual patterns.

Image Forgery Detection

Coupling Global Context and Local Contents for Weakly-Supervised Semantic Segmentation

1 code implementation • 18 Apr 2023 • Chunyan Wang, Dong Zhang, Liyan Zhang, Jinhui Tang

Specifically, a flexible context aggregation module is proposed to capture the global object context in different granular spaces.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

Boosting Convolution with Efficient MLP-Permutation for Volumetric Medical Image Segmentation

no code implementations • 23 Mar 2023 • Yi Lin, Xiao Fang, Dong Zhang, Kwang-Ting Cheng, Hao Chen

Recently, the advent of vision Transformer (ViT) has brought substantial advancements in 3D dataset benchmarks, particularly in 3D volumetric medical image segmentation (Vol-MedSeg).

Image Segmentation Semantic Segmentation +1

Semantic Scene Completion with Cleaner Self

1 code implementation • CVPR 2023 • Fengyun Wang, Dong Zhang, Hanwang Zhang, Jinhui Tang, Qianru Sun

SSC is a well-known ill-posed problem as the prediction model has to "imagine" what is behind the visible surface, which is usually represented by Truncated Signed Distance Function (TSDF).

Vessel-Promoted OCT to OCTA Image Translation by Heuristic Contextual Constraints

1 code implementation • 13 Mar 2023 • Shuhan Li, Dong Zhang, Xiaomeng Li, Chubin Ou, Lin An, Yanwu Xu, Kwang-Ting Cheng

In this paper, we propose a novel framework, TransPro, that translates 3D Optical Coherence Tomography (OCT) images into exclusive 3D OCTA images using an image translation pattern.

Translation

Deep-Learning Tool for Early Identifying Non-Traumatic Intracranial Hemorrhage Etiology based on CT Scan

no code implementations • 2 Feb 2023 • Meng Zhao, Yifan Hu, Ruixuan Jiang, Yuanli Zhao, Dong Zhang, Yan Zhang, Rong Wang, Yong Cao, Qian Zhang, Yonggang Ma, Jiaxi Li, Shaochen Yu, Wenjie Li, Ran Zhang, Yefeng Zheng, Shuo Wang, Jizong Zhao

Conclusions: The proposed deep learning algorithms can be an effective tool for early identification of hemorrhage etiologies based on NCCT scans.

Specificity

Protocol selection for second-order consensus against disturbance

no code implementations • 10 Dec 2022 • Jiamin Wang, Liqi Zhou, Dong Zhang, Jian Liu, Yuanshi Zheng

Noticing that both the absolute and relative velocity protocols can solve the second-order consensus of multi-agent systems, this paper aims to investigate which of the above two protocols has better anti-disturbance capability, in which the anti-disturbance capability is measured by the L2 gain from the disturbance to the consensus error.

Centralized Feature Pyramid for Object Detection

1 code implementation • 5 Oct 2022 • Yu Quan, Dong Zhang, Liyan Zhang, Jinhui Tang

To address this problem, in this paper, we propose a Centralized Feature Pyramid (CFP) for object detection, which is based on a globally explicit centralized feature regulation.

Object object-detection +1

220 stars

Understanding the Tricks of Deep Learning in Medical Image Segmentation: Challenges and Future Directions

1 code implementation • 21 Sep 2022 • Dong Zhang, Yi Lin, Hao Chen, Zhuotao Tian, Xin Yang, Jinhui Tang, Kwang Ting Cheng

Over the past few years, the rapid development of deep learning technologies for computer vision has significantly improved the performance of medical image segmentation (MedISeg).

Data Augmentation Domain Adaptation +3

273 stars

Graph Reasoning Transformer for Image Parsing

no code implementations • 20 Sep 2022 • Dong Zhang, Jinhui Tang, Kwang-Ting Cheng

In this paper, we propose a novel Graph Reasoning Transformer (GReaT) for image parsing to enable image patches to interact following a relation reasoning pattern.

Relation

Rethinking the Reference-based Distinctive Image Captioning

1 code implementation • 22 Jul 2022 • Yangjun Mao, Long Chen, Zhihong Jiang, Dong Zhang, Zhimeng Zhang, Jian Shao, Jun Xiao

Unfortunately, reference images used by existing Ref-DIC works are easy to distinguish: these reference images only resemble the target image at scene-level and have few common objects, such that a Ref-DIC model can trivially generate distinctive captions even without considering the reference images.

Attribute Benchmarking +1

FedMix: Mixed Supervised Federated Learning for Medical Image Segmentation

1 code implementation • 4 May 2022 • Jeffry Wicaksana, Zengqiang Yan, Dong Zhang, Xijie Huang, Huimin Wu, Xin Yang, Kwang-Ting Cheng

To relax this assumption, in this work, we propose a label-agnostic unified federated learning framework, named FedMix, for medical image segmentation based on mixed image labels.

Federated Learning Image Segmentation +4

Learning to Reduce Information Bottleneck for Object Detection in Aerial Images

1 code implementation • 5 Apr 2022 • Yuchen Shen, Dong Zhang, Zhihao Song, Xuesong Jiang, Qiaolin Ye

In this letter, we first underline the importance of the neck network in object detection from the perspective of information bottleneck.

object-detection Object Detection In Aerial Images

The Machine Learning for Combinatorial Optimization Competition (ML4CO): Results and Insights

2 code implementations • 4 Mar 2022 • Maxime Gasse, Quentin Cappart, Jonas Charfreitag, Laurent Charlin, Didier Chételat, Antonia Chmiela, Justin Dumouchelle, Ambros Gleixner, Aleksandr M. Kazachkov, Elias Khalil, Pawel Lichocki, Andrea Lodi, Miles Lubin, Chris J. Maddison, Christopher Morris, Dimitri J. Papageorgiou, Augustin Parjadis, Sebastian Pokutta, Antoine Prouvost, Lara Scavuzzo, Giulia Zarpellon, Linxin Yang, Sha Lai, Akang Wang, Xiaodong Luo, Xiang Zhou, Haohan Huang, Shengcheng Shao, Yuanming Zhu, Dong Zhang, Tao Quan, Zixuan Cao, Yang Xu, Zhewei Huang, Shuchang Zhou, Chen Binbin, He Minggui, Hao Hao, Zhang Zhiyu, An Zhiwu, Mao Kun

Combinatorial optimization is a well-established area in operations research and computer science.

BIG-bench Machine Learning Combinatorial Optimization

121 stars

FaceAtlasAR: Atlas of Facial Acupuncture Points in Augmented Reality

1 code implementation • 29 Nov 2021 • Menghe Zhang, Jurgen Schulze, Dong Zhang

Acupuncture is a technique in which practitioners stimulate specific points on the body.

Face Alignment

Towards Domain-Independent and Real-Time Gesture Recognition Using mmWave Signal

1 code implementation • 11 Nov 2021 • Yadong Li, Dongheng Zhang, Jinbo Chen, Jinwei Wan, Dong Zhang, Yang Hu, Qibin Sun, Yan Chen

To enhance the robustness of the system and reduce data collecting efforts, we design a data augmentation framework for mmWave signals based on correlations between signal patterns and gesture variations.

Data Augmentation Gesture Recognition

Cell-Level State of Charge Estimation for Battery Packs Under Minimal Sensing

no code implementations • 17 Sep 2021 • Dong Zhang, Luis D. Couto, Ross Drummond, Shashank Sripad, Venkatasubramanian Viswanathan

This manuscript presents an algorithm for individual Lithium-ion (Li-ion) battery cell state of charge (SOC) estimation in a large-scale battery pack under minimal sensing, where only pack-level voltage and current are measured.

More than Text: Multi-modal Chinese Word Segmentation

1 code implementation • ACL 2021 • Dong Zhang, Zheng Hu, Shoushan Li, Hanqian Wu, Qiaoming Zhu, Guodong Zhou

Chinese word segmentation (CWS) is undoubtedly an important basic task in natural language processing.

Chinese Word Segmentation

Region-Aware Network: Model Human's Top-Down Visual Perception Mechanism for Crowd Counting

no code implementations • 23 Jun 2021 • Yuehai Chen, Jing Yang, Dong Zhang, Kun Zhang, Badong Chen, Shaoyi Du

More specifically, we scan the whole input image and its priority maps as column vectors to obtain a relevance matrix that estimates their similarity.

Crowd Counting

Learning Calibrated-Guidance for Object Detection in Aerial Images

1 code implementation • 21 Mar 2021 • Zongqi Wei, Dong Liang, Dong Zhang, Liyan Zhang, Qixiang Geng, Mingqiang Wei, Huiyu Zhou

Specifically, for a given set of feature maps, CG first computes the feature similarity between each channel and the remaining channels as the intermediary calibration guidance.

Object object-detection +2

Machine Learning based Malicious Payload Identification in Software-Defined Networking

no code implementations • 4 Jan 2021 • Qiumei Cheng, Chunming Wu, Haifeng Zhou, Dezhang Kong, Dong Zhang, Junchi Xing, Wei Ruan

In this paper, a novel OpenFlow-enabled deep packet inspection (OFDPI) approach is proposed based on the SDN paradigm to provide adaptive and efficient packet inspection.

Networking and Internet Architecture

Deep image prior for undersampling high-speed photoacoustic microscopy

no code implementations • 15 Oct 2020 • Tri Vu, Anthony DiSpirito III, Daiwei Li, Zixuan Zhang, Xiaoyi Zhu, Maomao Chen, Laiming Jiang, Dong Zhang, Jianwen Luo, Yu Shrike Zhang, Qifa Zhou, Roarke Horstmeyer, Junjie Yao

Photoacoustic microscopy (PAM) is an emerging imaging method combining light and sound.

Vocal Bursts Intensity Prediction

Causal Intervention for Weakly-Supervised Semantic Segmentation

1 code implementation • NeurIPS 2020 • Dong Zhang, Hanwang Zhang, Jinhui Tang, Xian-Sheng Hua, Qianru Sun

We present a causal inference framework to improve Weakly-Supervised Semantic Segmentation (WSSS).

Ranked #36 on Weakly-Supervised Semantic Segmentation on COCO 2014 val

Attribute Causal Inference +3

181 stars

Dual-SLAM: A framework for robust single camera navigation

no code implementations • 23 Sep 2020 • Huajian Huang, Wen-Yan Lin, Siying Liu, Dong Zhang, Sai-Kit Yeung

As local pose estimation is ill-conditioned, local pose estimation failures happen regularly, making the overall SLAM system brittle.

Pose Estimation Simultaneous Localization and Mapping

Mask Detection and Breath Monitoring from Speech: on Data Augmentation, Feature Representation and Modeling

no code implementations • 12 Aug 2020 • Haiwei Wu, Lin Zhang, Lin Yang, Xuyang Wang, Jun-Jie Wang, Dong Zhang, Ming Li

This paper introduces our approaches for the Mask and Breathing Sub-Challenge in the Interspeech COMPARE Challenge 2020.

Data Augmentation

Feature Pyramid Transformer

1 code implementation • ECCV 2020 • Dong Zhang, Hanwang Zhang, Jinhui Tang, Meng Wang, Xiansheng Hua, Qianru Sun

Yet, the non-local spatial interactions are not across scales, and thus they fail to capture the non-local contexts of objects (or parts) residing in different scales.

Instance Segmentation object-detection +3

397 stars

Reconstructing undersampled photoacoustic microscopy images using deep learning

2 code implementations • 30 May 2020 • Anthony DiSpirito III, Daiwei Li, Tri Vu, Maomao Chen, Dong Zhang, Jianwen Luo, Roarke Horstmeyer, Junjie Yao

One primary technical challenge in photoacoustic microscopy (PAM) is the necessary compromise between spatial resolution and imaging speed.

3D Action Recognition

A multi-label classification method using a hierarchical and transparent representation for paper-reviewer recommendation

no code implementations • 19 Dec 2019 • Dong Zhang, Shu Zhao, Zhen Duan, Jie Chen, Yangping Zhang, Jie Tang

The paper-reviewer recommendation task is of significant academic importance for conference chairs and journal editors.

General Classification Multi-Label Classification

Direct Quantification for Coronary Artery Stenosis Using Multiview Learning

no code implementations • 20 Jul 2019 • Dong Zhang, Guang Yang, Shu Zhao, Yanping Zhang, Heye Zhang, Shuo Li

The proposed DMQCA model consists of a multiview module with two attention mechanisms, a key-frame module, and a regression module, to achieve direct accurate multiple-index estimation.

Multiview Learning regression

Modeling both context- and speaker-sensitive dependence for emotion detection in multi-speaker conversations

no code implementations • IJCAI 2019 • Dong Zhang, Liangqing Wu, Changlong Sun, Shoushan Li, Qiaoming Zhu, Guodong Zhou

On the one hand, our approach represents each utterance and each speaker as a node.

Ranked #56 on Emotion Recognition in Conversation on MELD

Emotion Recognition in Conversation

Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds

no code implementations • ECCV 2018 • Haroon Idrees, Muhmmad Tayyab, Kishan Athrey, Dong Zhang, Somaya Al-Maadeed, Nasir Rajpoot, Mubarak Shah

With multiple crowd gatherings of millions of people every year in events ranging from pilgrimages to protests, concerts to marathons, and festivals to funerals, visual crowd analysis is emerging as a new frontier in computer vision.

Ranked #12 on Crowd Counting on UCF-QNRF

Crowd Counting Management +1

Video Fill In the Blank using LR/RL LSTMs with Spatial-Temporal Attentions

1 code implementation • ICCV 2017 • Amir Mazaheri, Dong Zhang, Mubarak Shah

Since the source sentence is broken into two fragments: the sentence's left fragment (before the blank) and the sentence's right fragment (after the blank), traditional Recurrent Neural Networks cannot encode this structure accurately because of many possible variations of the missing word in terms of the location and type of the word in the source sentence.

Sentence

ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information

no code implementations • CVPR 2018 • Rodney LaLonde, Dong Zhang, Mubarak Shah

To reduce the large search space, the first stage (ClusterNet) takes in a set of extremely large video frames, combines the motion and appearance information within the convolutional architecture, and proposes regions of objects of interest (ROOBI).

Object object-detection +1

Unsupervised Action Proposal Ranking through Proposal Recombination

no code implementations • 3 Apr 2017 • Waqas Sultani, Dong Zhang, Mubarak Shah

Given the action proposals in a video, the goal of the proposed work is to generate a few better action proposals that are ranked properly.

Action Detection Action Recognition +1

User Classification with Multiple Textual Perspectives

no code implementations • COLING 2016 • Dong Zhang, Shoushan Li, Hongling Wang, Guodong Zhou

Textual information is of critical importance for automatic user classification in social media.

Age Classification Classification +3

Two-View Label Propagation to Semi-supervised Reader Emotion Classification

no code implementations • COLING 2016 • Shoushan Li, Jian Xu, Dong Zhang, Guodong Zhou

In this paper, we propose a two-view label propagation approach to semi-supervised reader emotion classification by exploiting two views, namely source text and response text in a label propagation algorithm.

Classification Emotion Classification +2

Video Fill in the Blank with Merging LSTMs

no code implementations • 13 Oct 2016 • Amir Mazaheri, Dong Zhang, Mubarak Shah

In the experiments, we have demonstrated the superior performance of the proposed method on the challenging "Movie Fill-in-the-Blank" dataset.

Local feature hierarchy for face recognition across pose and illumination

no code implementations • 12 Jul 2016 • Xiaoyue Jiang, Dong Zhang, Xiaoyi Feng

Accordingly we propose an end-to-end face recognition method to deal with pose and illumination simultaneously based on convolutional networks where the discriminative nonlinear features that are invariant to pose and illumination are extracted.

Face Recognition

A Framework for Human Pose Estimation in Videos

no code implementations • 26 Apr 2016 • Dong Zhang, Mubarak Shah

A sequence of the best poses is inferred from the abstract body part tracklets through the tree-based optimization.

Pose Estimation

Robust Scene Text Recognition Using Sparse Coding based Features

no code implementations • 29 Dec 2015 • Da-Han Wang, Hanzi Wang, Dong Zhang, Jonathan Li, David Zhang

For character detection, we use the HSC features instead of using the Histograms of Oriented Gradients (HOG) features.

Scene Text Recognition

Human Pose Estimation in Videos

no code implementations • ICCV 2015 • Dong Zhang, Mubarak Shah

Using the idea of 'Association', the optimal tracklets are generated for each abstract body part, in order to enforce the spatiotemporal constraints between body parts in adjacent frames.

Pose Estimation

Face Verification Using Boosted Cross-Image Features

no code implementations • 28 Sep 2013 • Dong Zhang, Omar Oreifej, Mubarak Shah

In contrast, we propose to extract cross-image features, i.e., features across the pair of images, which, as we demonstrate, are more discriminative of the similarity and dissimilarity of faces.

Face Detection Face Recognition +1

Video Object Segmentation through Spatially Accurate and Temporally Dense Extraction of Primary Object Regions

no code implementations • CVPR 2013 • Dong Zhang, Omar Javed, Mubarak Shah

The proposed approach has several contributions: First, a novel layered Directed Acyclic Graph (DAG) based framework is presented for detection and segmentation of the primary object in video.

Object Optical Flow Estimation +4
