Search Results for author: Donghui Yan

Found 12 papers, 0 papers with code

A Deep Neural Network Based Approach to Building Budget-Constrained Models for Big Data Analysis

no code implementations • 23 Feb 2023 • Rui Ming, Haiping Xu, Shannon E. Gibbs, Donghui Yan, Ming Shao

Deep learning approaches require collection of data on many different input features or variables for accurate model training and prediction.

Improving Short Text Classification With Augmented Data Using GPT-3

no code implementations • 23 May 2022 • Salvador Balkus, Donghui Yan

This study compares two classifiers: the GPT-3 Classification Endpoint with augmented examples, and the GPT-3 Completion Endpoint with an optimal training set chosen using a genetic algorithm.

Language Modelling · Text Classification +2
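The Completion Endpoint approach described above hinges on choosing a good training subset with a genetic algorithm. A rough illustration of that selection loop follows — not the paper's implementation: the real fitness function would be GPT-3 classification accuracy on held-out examples, while here `fitness` is any caller-supplied callable, and all names and parameters are invented for the sketch.

```python
import random

def select_training_set(n_pool, k, fitness, generations=40, pop_size=20, seed=0):
    """Genetic search for a size-k subset of pool indices maximizing `fitness`.

    An individual is a frozenset of k indices into the example pool.
    `fitness` maps such a set to a score (higher is better).
    """
    rng = random.Random(seed)
    pop = [frozenset(rng.sample(range(n_pool), k)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]            # truncation selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            genes = list(a | b)                   # crossover: draw genes from both parents
            child = set(rng.sample(genes, k))
            if rng.random() < 0.3:                # mutation: swap one index for a random one
                child.discard(rng.choice(list(child)))
                child.add(rng.randrange(n_pool))
            while len(child) < k:                 # repair if mutation created a duplicate
                child.add(rng.randrange(n_pool))
            children.append(frozenset(child))
        pop = parents + children
    return max(pop, key=fitness)
```

Because the fitter half survives each generation, the best score never decreases across generations.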

Learning Low-dimensional Manifolds for Scoring of Tissue Microarray Images

no code implementations • 22 Feb 2021 • Donghui Yan, Jian Zou, Zhenpeng Li

Inspired by the recent advance in semi-supervised learning and deep learning, we propose mfTacoma to learn alternative deep representations in the context of TMA image scoring.

Representation Learning

Estimating the Number of Infected Cases in COVID-19 Pandemic

no code implementations • 24 May 2020 • Donghui Yan, Ying Xu, Pei Wang

We propose a structured approach for the estimation of the number of unreported cases, where we distinguish cases that arrive late in the reported numbers and those who had mild or no symptoms and thus were not captured by any medical system at all.

$DC^2$: A Divide-and-conquer Algorithm for Large-scale Kernel Learning with Application to Clustering

no code implementations • 16 Nov 2019 • Ke Alexander Wang, Xinran Bian, Pan Liu, Donghui Yan

Analysis on $DC^2$ when applied to spectral clustering shows that the loss in clustering accuracy due to data division and reduction is upper bounded by the data approximation error which would vanish with recursive random projections.

Clustering
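The divide step this abstract describes — recursive random projections that partition the data into small cells before any expensive kernel computation — can be sketched as follows. This is a hypothetical simplification, not the authors' $DC^2$ code; `max_leaf` and the median cut rule are assumptions for illustration.

```python
import numpy as np

def rp_divide(X, max_leaf=100, rng=None):
    """Recursively split the rows of X by projecting onto a random
    direction and cutting at the median, until every cell holds at
    most `max_leaf` points. Returns a list of index arrays (a
    partition of range(len(X)))."""
    rng = rng or np.random.default_rng(0)

    def split(idx):
        if len(idx) <= max_leaf:
            return [idx]
        w = rng.normal(size=X.shape[1])       # random projection direction
        proj = X[idx] @ w
        cut = np.median(proj)
        left, right = idx[proj <= cut], idx[proj > cut]
        if len(left) == 0 or len(right) == 0:  # degenerate split: stop here
            return [idx]
        return split(left) + split(right)

    return split(np.arange(len(X)))
```

The kernel (or affinity) matrix is then computed only within each cell, costing O(m²) per cell of size m instead of O(n²) overall, which is where the communication and computation savings would come from.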

Similarity Kernel and Clustering via Random Projection Forests

no code implementations • 28 Aug 2019 • Donghui Yan, Songxiang Gu, Ying Xu, Zhiwei Qin

Similarity plays a fundamental role in many areas, including data mining, machine learning, statistics and various applied domains.

Clustering · Clustering Ensemble
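One natural reading of a similarity kernel built from random projection forests is: two points are similar to the degree that they land in the same leaf across many random projection trees. A minimal sketch under that reading — the function names and parameters are invented for illustration, not taken from the paper.

```python
import numpy as np

def rp_tree_leaves(X, max_leaf, rng):
    """Assign each row of X a leaf id in one random projection tree."""
    labels = np.zeros(len(X), dtype=int)
    next_id = [0]

    def grow(idx):
        if len(idx) <= max_leaf:
            labels[idx] = next_id[0]
            next_id[0] += 1
            return
        w = rng.normal(size=X.shape[1])        # random split direction
        proj = X[idx] @ w
        cut = np.median(proj)
        left, right = idx[proj <= cut], idx[proj > cut]
        if len(left) == 0 or len(right) == 0:  # degenerate split: make a leaf
            labels[idx] = next_id[0]
            next_id[0] += 1
            return
        grow(left)
        grow(right)

    grow(np.arange(len(X)))
    return labels

def rpf_similarity(X, n_trees=20, max_leaf=10, seed=0):
    """Similarity(i, j) = fraction of trees in which i and j share a leaf."""
    rng = np.random.default_rng(seed)
    S = np.zeros((len(X), len(X)))
    for _ in range(n_trees):
        leaves = rp_tree_leaves(X, max_leaf, rng)
        S += leaves[:, None] == leaves[None, :]
    return S / n_trees
```

The resulting matrix is symmetric with ones on the diagonal and can be fed to any similarity-based clustering method.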

Learning over inherently distributed data

no code implementations • 30 Jul 2019 • Donghui Yan, Ying Xu

The framework requires only a small number of local signatures to be shared among distributed sites, eliminating the need to transmit big data.

Distributed Computing

Fast communication-efficient spectral clustering over distributed data

no code implementations • 5 May 2019 • Donghui Yan, Yingjie Wang, Jin Wang, Guodong Wu, Honggang Wang

However, increasingly often the data are located at a number of distributed sites, and one wishes to compute over all of the data with low communication overhead.

Clustering · Distributed Computing

Cost-sensitive Selection of Variables by Ensemble of Model Sequences

no code implementations • 2 Jan 2019 • Donghui Yan, Zhiwei Qin, Songxiang Gu, Haiping Xu, Ming Shao

Many applications require the collection of data on different variables or measurements over many system performance metrics.

K-nearest Neighbor Search by Random Projection Forests

no code implementations • 31 Dec 2018 • Donghui Yan, Yingjie Wang, Jin Wang, Honggang Wang, Zhenpeng Li

Our theory can be used to refine the choice of random projections in the growth of trees, and experiments show that the effect is remarkable.
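The search procedure behind k-nearest-neighbor search with random projection trees — route the query down each tree, pool the leaves it reaches, then run an exact search over that small candidate set — might look like this in outline. This is an illustrative sketch, not the authors' code; names and parameters are assumptions.

```python
import numpy as np

def build_rp_tree(X, idx, max_leaf, rng):
    """Node is ('leaf', indices) or ('split', w, cut, left, right)."""
    if len(idx) <= max_leaf:
        return ('leaf', idx)
    w = rng.normal(size=X.shape[1])            # random projection direction
    proj = X[idx] @ w
    cut = np.median(proj)
    left, right = idx[proj <= cut], idx[proj > cut]
    if len(left) == 0 or len(right) == 0:      # degenerate split: make a leaf
        return ('leaf', idx)
    return ('split', w, cut,
            build_rp_tree(X, left, max_leaf, rng),
            build_rp_tree(X, right, max_leaf, rng))

def knn_rp_forest(X, q, k=5, n_trees=10, max_leaf=20, seed=0):
    """Approximate k-NN: route q down each tree, pool the leaves it
    lands in, then do an exact distance search over the candidates."""
    rng = np.random.default_rng(seed)
    candidates = set()
    for _ in range(n_trees):
        node = build_rp_tree(X, np.arange(len(X)), max_leaf, rng)
        while node[0] == 'split':
            _, w, cut, left, right = node
            node = left if q @ w <= cut else right
        candidates.update(node[1].tolist())
    cand = np.array(sorted(candidates))
    d = np.linalg.norm(X[cand] - q, axis=1)
    return cand[np.argsort(d)[:k]]
```

Using several trees hedges against any single unlucky projection separating the query from its true neighbors, which is consistent with the abstract's point that the choice of random projections drives accuracy.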

Cluster Forests

no code implementations • 14 Apr 2011 • Donghui Yan, Aiyou Chen, Michael I. Jordan

The search for good local clusterings is guided by a cluster quality measure kappa.

Clustering · Clustering Ensemble

Spectral Clustering with Perturbed Data

no code implementations • NeurIPS 2008 • Ling Huang, Donghui Yan, Nina Taft, Michael I. Jordan

We show that the error under perturbation of spectral clustering is closely related to the perturbation of the eigenvectors of the Laplacian matrix.

Clustering · Quantization
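The quantity at the heart of this result — how much the bottom eigenvectors of the graph Laplacian move when the data are perturbed — can be measured directly. A toy sketch follows, using a Gaussian-kernel similarity graph and the normalized Laplacian; the function names, kernel choice, and parameters are assumptions for illustration, not the paper's setup.

```python
import numpy as np

def laplacian_eigvecs(X, k=2, sigma=1.0):
    """Bottom-k eigenvectors of the normalized graph Laplacian of a
    Gaussian-kernel similarity graph on the rows of X."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))                 # affinity matrix
    D = W.sum(1)                                       # degrees
    L = np.eye(len(X)) - W / np.sqrt(D)[:, None] / np.sqrt(D)[None, :]
    vals, vecs = np.linalg.eigh(L)                     # ascending eigenvalues
    return vecs[:, :k]

def subspace_distance(U, V):
    """Sine of the largest principal angle between span(U) and span(V);
    0 means identical subspaces, values near 1 mean nearly orthogonal."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return float(np.sqrt(max(0.0, 1 - s.min() ** 2)))
```

Comparing the eigenvector subspaces of the Laplacians of original and perturbed data gives a concrete handle on the clustering error bound the abstract refers to: a small perturbation of well-separated clusters should yield a small subspace distance.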
