Search Results for author: Lan Du

Found 43 papers, 12 papers with code

Multilingual Neural Machine Translation: Can Linguistic Hierarchies Help?

no code implementations • Findings (EMNLP) 2021 • Fahimeh Saleh, Wray Buntine, Gholamreza Haffari, Lan Du

Multilingual Neural Machine Translation (MNMT) trains a single NMT model that supports translation between multiple languages, rather than training separate models for different languages.

Knowledge Distillation Machine Translation +2

Federated Distillation: A Survey

no code implementations • 2 Apr 2024 • Lin Li, Jianping Gou, Baosheng Yu, Lan Du, Zhang Yi, Dacheng Tao

Federated Learning (FL) seeks to train a model collaboratively without sharing private training data from individual clients.

Federated Learning Knowledge Distillation +1

Harnessing the Power of Beta Scoring in Deep Active Learning for Multi-Label Text Classification

no code implementations • 15 Jan 2024 • Wei Tan, Ngoc Dang Nguyen, Lan Du, Wray Buntine

Within the scope of natural language processing, the domain of multi-label text classification is uniquely challenging due to its expansive and uneven label distribution.

Active Learning Multi Label Text Classification +2

Bayesian Estimate of Mean Proper Scores for Diversity-Enhanced Active Learning

no code implementations • 15 Dec 2023 • Wei Tan, Lan Du, Wray Buntine

Expected Loss Reduction (ELR) focuses on a Bayesian estimate of the reduction in classification error, and more general costs fit in the same framework.

Active Learning

Low-Resource Named Entity Recognition: Can One-vs-All AUC Maximization Help?

no code implementations • 2 Nov 2023 • Ngoc Dang Nguyen, Wei Tan, Lan Du, Wray Buntine, Richard Beare, Changyou Chen

Named entity recognition (NER), a task that identifies and categorizes named entities such as persons or organizations from text, is traditionally framed as a multi-class classification problem.

Low Resource Named Entity Recognition Meta-Learning +4

Re-weighting Tokens: A Simple and Effective Active Learning Strategy for Named Entity Recognition

no code implementations • 2 Nov 2023 • Haocheng Luo, Wei Tan, Ngoc Dang Nguyen, Lan Du

Active learning, a widely adopted technique for enhancing machine learning models in text and image classification tasks with limited annotation resources, has received relatively little attention in the domain of Named Entity Recognition (NER).

Active Learning Image Classification +3

Towards Generalising Neural Topical Representations

1 code implementation • 24 Jul 2023 • Xiaohao Yang, He Zhao, Dinh Phung, Lan Du

To do so, we propose to enhance NTMs by narrowing the semantical distance between similar documents, with the underlying assumption that documents from different corpora may share similar semantics.

Data Augmentation Topic Models

Robust Educational Dialogue Act Classifiers with Low-Resource and Imbalanced Datasets

no code implementations • 15 Apr 2023 • Jionghao Lin, Wei Tan, Ngoc Dang Nguyen, David Lang, Lan Du, Wray Buntine, Richard Beare, Guanliang Chen, Dragan Gasevic

We note that many prior studies on classifying educational DAs employ cross entropy (CE) loss to optimize DA classifiers on low-resource data with imbalanced DA distribution.

Does Informativeness Matter? Active Learning for Educational Dialogue Act Classification

no code implementations • 12 Apr 2023 • Wei Tan, Jionghao Lin, David Lang, Guanliang Chen, Dragan Gasevic, Lan Du, Wray Buntine

Then, the study investigates how the AL methods can select informative samples to support DA classifiers in the AL sampling process.

Active Learning Dialogue Act Classification +2

SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing

1 code implementation • 27 Feb 2023 • Weidong Chen, Xiaofen Xing, Xiangmin Xu, Jianxin Pang, Lan Du

Paralinguistic speech processing is important in addressing many issues, such as sentiment and neurocognitive disorder analyses.

Alzheimer's Disease Detection Speech Emotion Recognition

HiTSKT: A Hierarchical Transformer Model for Session-Aware Knowledge Tracing

no code implementations • 23 Dec 2022 • Fucai Ke, Weiqing Wang, Weicong Tan, Lan Du, Yuan Jin, Yujin Huang, Hongzhi Yin

Knowledge tracing (KT) aims to leverage students' learning histories to estimate their mastery levels on a set of pre-defined skills, based on which the corresponding future performance can be accurately predicted.

Knowledge Tracing

AUC Maximization for Low-Resource Named Entity Recognition

no code implementations • 9 Dec 2022 • Ngoc Dang Nguyen, Wei Tan, Wray Buntine, Richard Beare, Changyou Chen, Lan Du

To the best of our knowledge, this is the first work that brings AUC maximization to the NER setting.

Low Resource Named Entity Recognition named-entity-recognition +2

Hardness-guided domain adaptation to recognise biomedical named entities under low-resource scenarios

no code implementations • 11 Nov 2022 • Ngoc Dang Nguyen, Lan Du, Wray Buntine, Changyou Chen, Richard Beare

Domain adaptation is an effective solution to data scarcity in low-resource scenarios.

Domain Adaptation

Learning Semantic Textual Similarity via Topic-informed Discrete Latent Variables

1 code implementation • 7 Nov 2022 • Erxin Yu, Lan Du, Yuan Jin, Zhepei Wei, Yi Chang

Recently, discrete latent variable models have received a surge of interest in both Natural Language Processing (NLP) and Computer Vision (CV), attributed to their comparable performance to the continuous counterparts in representation learning, while being more interpretable in their predictions.

Language Modelling Quantization +4

Uncertainty Estimation for Multi-view Data: The Power of Seeing the Whole Picture

no code implementations • 6 Oct 2022 • Myong Chol Jung, He Zhao, Joanna Dipnall, Belinda Gabbe, Lan Du

Uncertainty estimation is essential to make neural networks trustworthy in real-world applications.

Diversity Enhanced Active Learning with Strictly Proper Scoring Rules

1 code implementation • NeurIPS 2021 • Wei Tan, Lan Du, Wray Buntine

We convert the ELR framework to estimate the increase in (strictly proper) scores like log probability or negative mean square error, which we call Bayesian Estimate of Mean Proper Scores (BEMPS).

Active Learning text-classification +1

Multilingual Neural Machine Translation: Can Linguistic Hierarchies Help?

no code implementations • 15 Oct 2021 • Fahimeh Saleh, Wray Buntine, Gholamreza Haffari, Lan Du

Multilingual Neural Machine Translation (MNMT) trains a single NMT model that supports translation between multiple languages, rather than training separate models for different languages.

Knowledge Distillation Machine Translation +2

Transformer over Pre-trained Transformer for Neural Text Segmentation with Enhanced Topic Coherence

no code implementations • Findings (EMNLP) 2021 • Kelvin Lo, Yuan Jin, Weicong Tan, Ming Liu, Lan Du, Wray Buntine

This paper proposes a transformer over transformer framework, called Transformer$^2$, to perform neural text segmentation.

Segmentation Sentence +2

Neural Attention-Aware Hierarchical Topic Model

no code implementations • EMNLP 2021 • Yuan Jin, He Zhao, Ming Liu, Lan Du, Wray Buntine

Neural topic models (NTMs) apply deep neural networks to topic modelling.

Sentence Topic Models

Leveraging Information Bottleneck for Scientific Document Summarization

no code implementations • Findings (EMNLP) 2021 • Jiaxin Ju, Ming Liu, Huan Yee Koh, Yuan Jin, Lan Du, Shirui Pan

This paper presents an unsupervised extractive approach to summarize scientific long documents based on the Information Bottleneck principle.

Document Summarization Language Modelling +3

Prototype-Guided Memory Replay for Continual Learning

no code implementations • 28 Aug 2021 • Stella Ho, Ming Liu, Lan Du, Longxiang Gao, Yong Xiang

Continual learning (CL) refers to a machine learning paradigm that learns continuously without forgetting previously acquired knowledge.

Continual Learning Meta-Learning +3

Learning Graph Neural Networks with Positive and Unlabeled Nodes

no code implementations • 8 Mar 2021 • Man Wu, Shirui Pan, Lan Du, Xingquan Zhu

By generating multiple graphs at different distance levels, based on the adjacency matrix, we develop a long-short distance attention model to model these graphs.

Node Classification Transductive Learning

Stratified Sampling for Extreme Multi-Label Data

1 code implementation • 5 Mar 2021 • Maximillian Merrillees, Lan Du

Extreme multi-label classification (XML) is becoming increasingly relevant in the era of big data.

Extreme Multi-Label Classification

Topic Modelling Meets Deep Neural Networks: A Survey

no code implementations • 28 Feb 2021 • He Zhao, Dinh Phung, Viet Huynh, Yuan Jin, Lan Du, Wray Buntine

Topic modelling has been a successful technique for text analysis for almost twenty years.

Navigate Text Generation +1

Collaborative Teacher-Student Learning via Multiple Knowledge Transfer

no code implementations • 21 Jan 2021 • Liyuan Sun, Jianping Gou, Baosheng Yu, Lan Du, DaCheng Tao

However, most of the existing knowledge distillation methods consider only one type of knowledge learned from either instance features or instance relations via a specific distillation strategy in teacher-student learning.

Knowledge Distillation Model Compression +2

Multi-label Few/Zero-shot Learning with Knowledge Aggregated from Multiple Label Graphs

1 code implementation • EMNLP 2020 • Jueqing Lu, Lan Du, Ming Liu, Joanna Dipnall

Few/Zero-shot learning is a big challenge in many classification tasks, where a classifier is required to recognise instances of classes that have very few or even no training samples.

Document Classification General Classification +3

SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression

1 code implementation • 17 Jul 2020 • Jinming Zhao, Ming Liu, Longxiang Gao, Yuan Jin, Lan Du, He Zhao, He Zhang, Gholamreza Haffari

Obtaining training data for multi-document summarization (MDS) is time-consuming and resource-intensive, so recent neural models can only be trained for limited domains.

Clustering Document Summarization +2

Leveraging Cross Feedback of User and Item Embeddings with Attention for Variational Autoencoder based Collaborative Filtering

no code implementations • 21 Feb 2020 • Yuan Jin, He Zhao, Ming Liu, Ye Zhu, Lan Du, Longxiang Gao, He Zhang, Yunfeng Li

Based on the ELBOs, we propose a VAE-based Bayesian MF framework.

Collaborative Filtering Recommendation Systems

Variational Auto-encoder Based Bayesian Poisson Tensor Factorization for Sparse and Imbalanced Count Data

no code implementations • 12 Oct 2019 • Yuan Jin, Ming Liu, Yunfeng Li, Ruohua Xu, Lan Du, Longxiang Gao, Yong Xiang

Under synthetic data evaluation, VAE-BPTF tended to recover the right number of latent factors and posterior parameter values.

Leveraging Meta Information in Short Text Aggregation

no code implementations • ACL 2019 • He Zhao, Lan Du, Guanfeng Liu, Wray Buntine

Short texts such as tweets often contain insufficient word co-occurrence information for training conventional topic models.

Clustering Topic Models

Variational Autoencoders for Sparse and Overdispersed Discrete Data

1 code implementation • 2 May 2019 • He Zhao, Piyush Rai, Lan Du, Wray Buntine, Mingyuan Zhou

Many applications, such as text modelling, high-throughput sequencing, and recommender systems, require analysing sparse, high-dimensional, and overdispersed discrete (count-valued or binary) data.

Collaborative Filtering Multi-Label Learning +1

Dirichlet belief networks for topic structure learning

2 code implementations • NeurIPS 2018 • He Zhao, Lan Du, Wray Buntine, Mingyuan Zhou

Recently, considerable research effort has been devoted to developing deep architectures for topic models to learn topic structures.

Topic Models

Improving Topic Models with Latent Feature Word Representations

no code implementations • TACL 2015 • Dat Quoc Nguyen, Richard Billingsley, Lan Du, Mark Johnson

Probabilistic topic models are widely used to discover latent topics in document collections, while latent feature vector representations of words have been used to obtain high performance in many NLP tasks.

Clustering Document Classification +2

Inter and Intra Topic Structure Learning with Word Embeddings

1 code implementation • ICML 2018 • He Zhao, Lan Du, Wray Buntine, Mingyuan Zhou

One important task of topic modeling for text analysis is interpretability.

Document Classification Word Embeddings

MetaLDA: a Topic Model that Efficiently Incorporates Meta information

1 code implementation • 19 Sep 2017 • He Zhao, Lan Du, Wray Buntine, Gang Liu

Besides the text content, documents and their associated words usually come with rich sets of meta information, such as categories of documents and semantic/syntactic features of words, like those encoded in word embeddings.

Topic Models Word Embeddings

Unsupervised Text Segmentation Based on Native Language Characteristics

no code implementations • ACL 2017 • Shervin Malmasi, Mark Dras, Mark Johnson, Lan Du, Magdalena Wolska

Most work on segmenting text does so on the basis of topic changes, but it can be of interest to segment by other, stylistically expressed characteristics such as change of authorship or native language.

Segmentation Text Segmentation

Leveraging Node Attributes for Incomplete Relational Data

1 code implementation • ICML 2017 • He Zhao, Lan Du, Wray Buntine

Relational data are usually highly incomplete in practice, which inspires us to leverage side information to improve the performance of community detection and link prediction.

Community Detection Link Prediction

Nonparametric Bayesian Topic Modelling with the Hierarchical Pitman-Yor Processes

no code implementations • 22 Sep 2016 • Kar Wai Lim, Wray Buntine, Changyou Chen, Lan Du

In this article, we present efficient methods for the use of these processes in this hierarchical context, and apply them to latent variable models for text analytics.

Topic Models

Using Entity Information from a Knowledge Base to Improve Relation Extraction

no code implementations • ALTA 2015 • Lan Du, Anish Kumar, Mark Johnson, Massimiliano Ciaramita

Named Entity Recognition (NER) Relation +1

A Computationally Efficient Algorithm for Learning Topical Collocation Models

no code implementations • IJCNLP 2015 • Zhendong Zhao, Lan Du, Benjamin Börschinger, John K Pate, Massimiliano Ciaramita, Mark Steedman, Mark Johnson

Document Classification Information Retrieval +1

Topic Segmentation with a Structured Topic Model

no code implementations • NAACL 2013 • Lan Du, Wray Buntine, Mark Johnson

Information Retrieval Topic Models

Modelling Sequential Text with an Adaptive Topic Model

no code implementations • EMNLP 2012 • Lan Du, Wray Buntine, Huidong Jin

Topic Models

A Bayesian Model for Simultaneous Image Clustering, Annotation and Object Segmentation

no code implementations • NeurIPS 2009 • Lan Du, Lu Ren, Lawrence Carin, David B. Dunson

The model clusters the images into classes, and each image is segmented into a set of objects, also allowing the opportunity to assign a word to each object (localized labeling).

Clustering Image Clustering +2
