Search Results for author: Donghui Yan

Found 12 papers, 0 papers with code

A Deep Neural Network Based Approach to Building Budget-Constrained Models for Big Data Analysis

no code implementations • 23 Feb 2023 • Rui Ming, Haiping Xu, Shannon E. Gibbs, Donghui Yan, Ming Shao

Deep learning approaches require collection of data on many different input features or variables for accurate model training and prediction.

Improving Short Text Classification With Augmented Data Using GPT-3

no code implementations • 23 May 2022 • Salvador Balkus, Donghui Yan

This study compares two classifiers: the GPT-3 Classification Endpoint with augmented examples, and the GPT-3 Completion Endpoint with an optimal training set chosen using a genetic algorithm.

Language Modelling · Text Classification +2
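The Completion Endpoint approach described above hinges on choosing a good training subset with a genetic algorithm. A rough illustration of that selection loop follows — not the paper's implementation: the real fitness function would be GPT-3 classification accuracy on held-out examples, while here `fitness` is any caller-supplied callable, and all names and parameters are invented for the sketch.

```python
import random

def select_training_set(n_pool, k, fitness, generations=40, pop_size=20, seed=0):
    """Genetic search for a size-k subset of pool indices maximizing `fitness`.

    An individual is a frozenset of k indices into the example pool.
    `fitness` maps such a set to a score (higher is better).
    """
    rng = random.Random(seed)
    pop = [frozenset(rng.sample(range(n_pool), k)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]            # truncation selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            genes = list(a | b)                   # crossover: draw genes from both parents
            child = set(rng.sample(genes, k))
            if rng.random() < 0.3:                # mutation: swap one index for a random one
                child.discard(rng.choice(list(child)))
                child.add(rng.randrange(n_pool))
            while len(child) < k:                 # repair if mutation created a duplicate
                child.add(rng.randrange(n_pool))
            children.append(frozenset(child))
        pop = parents + children
    return max(pop, key=fitness)
```

Because the fitter half survives each generation, the best score never decreases across generations.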

Learning Low-dimensional Manifolds for Scoring of Tissue Microarray Images

no code implementations • 22 Feb 2021 • Donghui Yan, Jian Zou, Zhenpeng Li

Inspired by the recent advance in semi-supervised learning and deep learning, we propose mfTacoma to learn alternative deep representations in the context of TMA image scoring.

Representation Learning

Estimating the Number of Infected Cases in COVID-19 Pandemic

no code implementations • 24 May 2020 • Donghui Yan, Ying Xu, Pei Wang

We propose a structured approach for the estimation of the number of unreported cases, where we distinguish cases that arrive late in the reported numbers and those who had mild or no symptoms and thus were not captured by any medical system at all.

$DC^2$: A Divide-and-conquer Algorithm for Large-scale Kernel Learning with Application to Clustering

no code implementations • 16 Nov 2019 • Ke Alexander Wang, Xinran Bian, Pan Liu, Donghui Yan

Analysis on $DC^2$ when applied to spectral clustering shows that the loss in clustering accuracy due to data division and reduction is upper bounded by the data approximation error which would vanish with recursive random projections.

Clustering
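The divide step this abstract describes — recursive random projections that partition the data into small cells before any expensive kernel computation — can be sketched as follows. This is a hypothetical simplification, not the authors' $DC^2$ code; `max_leaf` and the median cut rule are assumptions for illustration.

```python
import numpy as np

def rp_divide(X, max_leaf=100, rng=None):
    """Recursively split the rows of X by projecting onto a random
    direction and cutting at the median, until every cell holds at
    most `max_leaf` points. Returns a list of index arrays (a
    partition of range(len(X)))."""
    rng = rng or np.random.default_rng(0)

    def split(idx):
        if len(idx) <= max_leaf:
            return [idx]
        w = rng.normal(size=X.shape[1])       # random projection direction
        proj = X[idx] @ w
        cut = np.median(proj)
        left, right = idx[proj <= cut], idx[proj > cut]
        if len(left) == 0 or len(right) == 0:  # degenerate split: stop here
            return [idx]
        return split(left) + split(right)

    return split(np.arange(len(X)))
```

The kernel (or affinity) matrix is then computed only within each cell, costing O(m²) per cell of size m instead of O(n²) overall, which is where the communication and computation savings would come from.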

Similarity Kernel and Clustering via Random Projection Forests

no code implementations • 28 Aug 2019 • Donghui Yan, Songxiang Gu, Ying Xu, Zhiwei Qin

Similarity plays a fundamental role in many areas, including data mining, machine learning, statistics and various applied domains.

Clustering · Clustering Ensemble
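One natural reading of a similarity kernel built from random projection forests is: two points are similar to the degree that they land in the same leaf across many random projection trees. A minimal sketch under that reading — the function names and parameters are invented for illustration, not taken from the paper.

```python
import numpy as np

def rp_tree_leaves(X, max_leaf, rng):
    """Assign each row of X a leaf id in one random projection tree."""
    labels = np.zeros(len(X), dtype=int)
    next_id = [0]

    def grow(idx):
        if len(idx) <= max_leaf:
            labels[idx] = next_id[0]
            next_id[0] += 1
            return
        w = rng.normal(size=X.shape[1])        # random split direction
        proj = X[idx] @ w
        cut = np.median(proj)
        left, right = idx[proj <= cut], idx[proj > cut]
        if len(left) == 0 or len(right) == 0:  # degenerate split: make a leaf
            labels[idx] = next_id[0]
            next_id[0] += 1
            return
        grow(left)
        grow(right)

    grow(np.arange(len(X)))
    return labels

def rpf_similarity(X, n_trees=20, max_leaf=10, seed=0):
    """Similarity(i, j) = fraction of trees in which i and j share a leaf."""
    rng = np.random.default_rng(seed)
    S = np.zeros((len(X), len(X)))
    for _ in range(n_trees):
        leaves = rp_tree_leaves(X, max_leaf, rng)
        S += leaves[:, None] == leaves[None, :]
    return S / n_trees
```

The resulting matrix is symmetric with ones on the diagonal and can be fed to any similarity-based clustering method.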

Learning over inherently distributed data

no code implementations • 30 Jul 2019 • Donghui Yan, Ying Xu

The framework requires only a small number of local signatures to be shared among distributed sites, eliminating the need to transmit big data.

Distributed Computing

Fast communication-efficient spectral clustering over distributed data

no code implementations • 5 May 2019 • Donghui Yan, Yingjie Wang, Jin Wang, Guodong Wu, Honggang Wang

However, increasingly often the data are located at a number of distributed sites, and one wishes to compute over all of the data with low communication overhead.

Clustering · Distributed Computing

Cost-sensitive Selection of Variables by Ensemble of Model Sequences

no code implementations • 2 Jan 2019 • Donghui Yan, Zhiwei Qin, Songxiang Gu, Haiping Xu, Ming Shao

Many applications require the collection of data on different variables or measurements over many system performance metrics.

K-nearest Neighbor Search by Random Projection Forests

no code implementations • 31 Dec 2018 • Donghui Yan, Yingjie Wang, Jin Wang, Honggang Wang, Zhenpeng Li

Our theory can be used to refine the choice of random projections in the growth of trees, and experiments show that the effect is remarkable.
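The search procedure behind k-nearest-neighbor search with random projection trees — route the query down each tree, pool the leaves it reaches, then run an exact search over that small candidate set — might look like this in outline. This is an illustrative sketch, not the authors' code; names and parameters are assumptions.

```python
import numpy as np

def build_rp_tree(X, idx, max_leaf, rng):
    """Node is ('leaf', indices) or ('split', w, cut, left, right)."""
    if len(idx) <= max_leaf:
        return ('leaf', idx)
    w = rng.normal(size=X.shape[1])            # random projection direction
    proj = X[idx] @ w
    cut = np.median(proj)
    left, right = idx[proj <= cut], idx[proj > cut]
    if len(left) == 0 or len(right) == 0:      # degenerate split: make a leaf
        return ('leaf', idx)
    return ('split', w, cut,
            build_rp_tree(X, left, max_leaf, rng),
            build_rp_tree(X, right, max_leaf, rng))

def knn_rp_forest(X, q, k=5, n_trees=10, max_leaf=20, seed=0):
    """Approximate k-NN: route q down each tree, pool the leaves it
    lands in, then do an exact distance search over the candidates."""
    rng = np.random.default_rng(seed)
    candidates = set()
    for _ in range(n_trees):
        node = build_rp_tree(X, np.arange(len(X)), max_leaf, rng)
        while node[0] == 'split':
            _, w, cut, left, right = node
            node = left if q @ w <= cut else right
        candidates.update(node[1].tolist())
    cand = np.array(sorted(candidates))
    d = np.linalg.norm(X[cand] - q, axis=1)
    return cand[np.argsort(d)[:k]]
```

Using several trees hedges against any single unlucky projection separating the query from its true neighbors, which is consistent with the abstract's point that the choice of random projections drives accuracy.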

Cluster Forests

no code implementations • 14 Apr 2011 • Donghui Yan, Aiyou Chen, Michael I. Jordan

The search for good local clusterings is guided by a cluster quality measure kappa.

Clustering · Clustering Ensemble

Spectral Clustering with Perturbed Data

no code implementations • NeurIPS 2008 • Ling Huang, Donghui Yan, Nina Taft, Michael I. Jordan

We show that the error under perturbation of spectral clustering is closely related to the perturbation of the eigenvectors of the Laplacian matrix.

Clustering · Quantization
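The quantity at the heart of this result — how much the bottom eigenvectors of the graph Laplacian move when the data are perturbed — can be measured directly. A toy sketch follows, using a Gaussian-kernel similarity graph and the normalized Laplacian; the function names, kernel choice, and parameters are assumptions for illustration, not the paper's setup.

```python
import numpy as np

def laplacian_eigvecs(X, k=2, sigma=1.0):
    """Bottom-k eigenvectors of the normalized graph Laplacian of a
    Gaussian-kernel similarity graph on the rows of X."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))                 # affinity matrix
    D = W.sum(1)                                       # degrees
    L = np.eye(len(X)) - W / np.sqrt(D)[:, None] / np.sqrt(D)[None, :]
    vals, vecs = np.linalg.eigh(L)                     # ascending eigenvalues
    return vecs[:, :k]

def subspace_distance(U, V):
    """Sine of the largest principal angle between span(U) and span(V);
    0 means identical subspaces, values near 1 mean nearly orthogonal."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return float(np.sqrt(max(0.0, 1 - s.min() ** 2)))
```

Comparing the eigenvector subspaces of the Laplacians of original and perturbed data gives a concrete handle on the clustering error bound the abstract refers to: a small perturbation of well-separated clusters should yield a small subspace distance.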
