Search Results for author: T. Tony Cai

Found 35 papers, 5 papers with code

Optimal Differentially Private PCA and Estimation for Spiked Covariance Matrices

no code implementations • 8 Jan 2024 • T. Tony Cai, Dong Xia, Mengyue Zha

Estimating a covariance matrix and its associated principal components is a fundamental problem in contemporary statistics.

valid

Score Attack: A Lower Bound Technique for Optimal Differentially Private Learning

no code implementations • 13 Mar 2023 • T. Tony Cai, Yichen Wang, Linjun Zhang

The score attack method is based on the tracing attack concept in differential privacy and can be applied to any statistical model with a well-defined score statistic.

Testing High-dimensional Multinomials with Applications to Text Analysis

no code implementations • 3 Jan 2023 • T. Tony Cai, Zheng Tracy Ke, Paxton Turner

Motivated by applications in text mining and discrete distribution inference, we investigate the testing for equality of probability mass functions of $K$ groups of high-dimensional multinomial distributions.

Vocal Bursts Intensity Prediction

Transfer Learning for Contextual Multi-armed Bandits

no code implementations • 22 Nov 2022 • Changxiao Cai, T. Tony Cai, Hongzhe Li

The results quantify the contribution of the data from the source domains for learning in the target domain in the context of nonparametric contextual multi-armed bandits.

Multi-Armed Bandits, Transfer Learning

Locally Adaptive Algorithms for Multiple Testing with Network Structure, with Application to Genome-Wide Association Studies

1 code implementation • 22 Mar 2022 • Ziyi Liang, T. Tony Cai, Wenguang Sun, Yin Xia

Linkage analysis has provided valuable insights to the GWAS studies, particularly in revealing that SNPs in linkage disequilibrium (LD) can jointly influence disease phenotypes.

Transfer Learning

Matrix Reordering for Noisy Disordered Matrices: Optimality and Computationally Efficient Algorithms

no code implementations • 17 Jan 2022 • T. Tony Cai, Rong Ma

Motivated by applications in single-cell biology and metagenomics, we investigate the problem of matrix reordering based on a noisy disordered monotone Toeplitz matrix model.

Distributed Nonparametric Function Estimation: Optimal Rate of Convergence and Cost of Adaptation

no code implementations • 1 Jul 2021 • T. Tony Cai, Hongji Wei

Distributed minimax estimation and distributed adaptive estimation under communication constraints for Gaussian sequence model and white noise model are studied.

Theoretical Foundations of t-SNE for Visualizing High-Dimensional Clustered Data

no code implementations • 16 May 2021 • T. Tony Cai, Rong Ma

This paper investigates the theoretical foundations of the t-distributed stochastic neighbor embedding (t-SNE) algorithm, a popular nonlinear dimension reduction and data visualization method.

Clustering, Data Visualization

The Cost of Privacy in Generalized Linear Models: Algorithms and Minimax Lower Bounds

no code implementations • 8 Nov 2020 • T. Tony Cai, Yichen Wang, Linjun Zhang

We propose differentially private algorithms for parameter estimation in both low-dimensional and high-dimensional sparse generalized linear models (GLMs) by constructing private versions of projected gradient descent.

LEMMA

Estimation, Confidence Intervals, and Large-Scale Hypotheses Testing for High-Dimensional Mixed Linear Regression

no code implementations • 6 Nov 2020 • Linjun Zhang, Rong Ma, T. Tony Cai, Hongzhe Li

Based on the iterative estimators, we further construct debiased estimators and establish their asymptotic normality.

regression

Transfer Learning in Large-scale Gaussian Graphical Models with False Discovery Rate Control

1 code implementation • 21 Oct 2020 • Sai Li, T. Tony Cai, Hongzhe Li

Transfer learning for high-dimensional Gaussian graphical models (GGMs) is studied with the goal of estimating the target GGM by utilizing the data from similar and related auxiliary studies.

Edge Detection, Transfer Learning

Transfer Learning for High-dimensional Linear Regression: Prediction, Estimation, and Minimax Optimality

1 code implementation • 18 Jun 2020 • Sai Li, T. Tony Cai, Hongzhe Li

This paper considers the estimation and prediction of a high-dimensional linear regression in the setting of transfer learning, using samples from the target model as well as auxiliary samples from different but possibly related regression models.

Regression, Transfer Learning

Optimal Structured Principal Subspace Estimation: Metric Entropy and Minimax Rates

no code implementations • 18 Feb 2020 • T. Tony Cai, Hongzhe Li, Rong Ma

Driven by a wide range of applications, many principal subspace estimation problems have been studied individually under different structural constraints.

Clustering

Distributed Gaussian Mean Estimation under Communication Constraints: Optimal Rates and Communication-Efficient Algorithms

no code implementations • 24 Jan 2020 • T. Tony Cai, Hongji Wei

Although optimal estimation of a Gaussian mean is relatively simple in the conventional setting, it is quite involved under the communication constraints, both in terms of the optimal procedure design and lower bound argument.

High Dimensional M-Estimation with Missing Outcomes: A Semi-Parametric Framework

no code implementations • 26 Nov 2019 • Abhishek Chakrabortty, Jiarui Lu, T. Tony Cai, Hongzhe Li

Under mild tail assumptions and arbitrarily chosen (working) models for the propensity score (PS) and the outcome regression (OR) estimators, satisfying only some high-level conditions, we establish finite sample performance bounds for the DDR estimator showing its (optimal) $L_2$ error rate to be $\sqrt{s (\log d)/ n}$ when both models are correct, and its consistency and DR properties when only one of them is correct.

Causal Inference, Regression

Sparse Group Lasso: Optimal Sample Complexity, Convergence Rate, and Statistical Inference

no code implementations • 21 Sep 2019 • T. Tony Cai, Anru R. Zhang, Yuchen Zhou

We study sparse group Lasso for high-dimensional double sparse linear regression, where the parameter of interest is simultaneously element-wise and group-wise sparse.

regression

Transfer Learning for Nonparametric Classification: Minimax Rate and Adaptive Classifier

no code implementations • 7 Jun 2019 • T. Tony Cai, Hongji Wei

In this paper, we study transfer learning in the context of nonparametric classification based on observations from different distributions under the posterior drift model, which is a general framework and arises in many practical problems.

Classification, General Classification

The Cost of Privacy: Optimal Rates of Convergence for Parameter Estimation with Differential Privacy

no code implementations • 12 Feb 2019 • T. Tony Cai, Yichen Wang, Linjun Zhang

By refining the "tracing adversary" technique for lower bounds in the theoretical computer science literature, we formulate a general lower bound argument for minimax risks with differential privacy constraints, and apply this argument to high-dimensional mean estimation and linear regression problems.

Privacy Preserving, Regression

Heteroskedastic PCA: Algorithm, Optimality, and Applications

1 code implementation • 19 Oct 2018 • Anru R. Zhang, T. Tony Cai, Yihong Wu

A general framework for principal component analysis (PCA) in the presence of heteroskedastic noise is introduced.

Denoising

Weighted Message Passing and Minimum Energy Flow for Heterogeneous Stochastic Block Models with Side Information

no code implementations • 12 Sep 2017 • T. Tony Cai, Tengyuan Liang, Alexander Rakhlin

We develop an optimally weighted message passing algorithm to reconstruct labels for SBM based on the minimum energy flow and the eigenvectors of a certain Markov transition matrix.

Community Detection

Semi-supervised Inference: General Theory and Estimation of Means

no code implementations • 23 Jun 2016 • Anru Zhang, Lawrence D. Brown, T. Tony Cai

Estimators are proposed along with corresponding confidence intervals for the population mean.

On Detection and Structural Reconstruction of Small-World Random Networks

no code implementations • 21 Apr 2016 • T. Tony Cai, Tengyuan Liang, Alexander Rakhlin

In this paper, we study detection and fast reconstruction of the celebrated Watts-Strogatz (WS) small-world random graph model (Watts and Strogatz, 1998), which aims to describe real-world complex networks that exhibit both high clustering and short average length properties.

Clustering

Inference via Message Passing on Partially Labeled Stochastic Block Models

no code implementations • 22 Mar 2016 • T. Tony Cai, Tengyuan Liang, Alexander Rakhlin

We study the community detection and recovery problem in partially-labeled stochastic block models (SBM).

Community Detection

A Sparse PCA Approach to Clustering

no code implementations • 16 Feb 2016 • T. Tony Cai, Linjun Zhang

We discuss a clustering method for Gaussian mixture model based on the sparse principal component analysis (SPCA) method and compare it with the IF-PCA method.

Clustering

Optimal Rates of Convergence for Noisy Sparse Phase Retrieval via Thresholded Wirtinger Flow

1 code implementation • 10 Jun 2015 • T. Tony Cai, Xiao-Dong Li, Zongming Ma

This paper considers the noisy sparse phase retrieval problem: recovering a sparse signal $x \in \mathbb{R}^p$ from noisy quadratic measurements $y_j = (a_j' x )^2 + \epsilon_j$, $j=1, \ldots, m$, with independent sub-exponential noise $\epsilon_j$.

Retrieval

Structured Matrix Completion with Applications to Genomic Data Integration

no code implementations • 8 Apr 2015 • Tianxi Cai, T. Tony Cai, Anru Zhang

Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering.

Data Integration, Electrical Engineering

Computational and Statistical Boundaries for Submatrix Localization in a Large Noisy Matrix

no code implementations • 6 Feb 2015 • T. Tony Cai, Tengyuan Liang, Alexander Rakhlin

The second threshold, $\sf SNR_s$, captures the statistical boundary, below which no method can succeed with probability going to one in the minimax sense.

Computational Efficiency

Rate-Optimal Detection of Very Short Signal Segments

no code implementations • 10 Jul 2014 • T. Tony Cai, Ming Yuan

Motivated by a range of applications in engineering and genomics, we consider in this paper detection of very short signal segments in three settings: signals with known shape, arbitrary signals, and smooth signals.

Robust and computationally feasible community detection in the presence of arbitrary outlier nodes

no code implementations • 23 Apr 2014 • T. Tony Cai, Xiao-Dong Li

To the best of the authors' knowledge, our result is the first in the literature in terms of clustering communities with fast growing numbers under the GSBM where a portion of arbitrary outlier nodes exist.

Clustering, Community Detection

Geometric Inference for General High-Dimensional Linear Inverse Problems

no code implementations • 17 Apr 2014 • T. Tony Cai, Tengyuan Liang, Alexander Rakhlin

This paper presents a unified geometric framework for the statistical analysis of a general ill-posed linear inverse model which includes as special cases noisy compressed sensing, sign vector recovery, trace regression, orthogonal matrix estimation, and noisy matrix completion.

Matrix Completion, Regression

ROP: Matrix recovery via rank-one projections

no code implementations • 22 Oct 2013 • T. Tony Cai, Anru Zhang

In this paper, we introduce a rank-one projection model for low-rank matrix recovery and propose a constrained nuclear norm minimization method for stable recovery of low-rank matrices in the noisy case.

Sparse Representation of a Polytope and Recovery of Sparse Signals and Low-rank Matrices

no code implementations • 5 Jun 2013 • T. Tony Cai, Anru Zhang

It is shown that for any given constant $t\ge {4/3}$, in compressed sensing $\delta_{tk}^A < \sqrt{(t-1)/t}$ guarantees the exact recovery of all $k$ sparse signals in the noiseless case through the constrained $\ell_1$ minimization, and similarly in affine rank minimization $\delta_{tr}^\mathcal{M}< \sqrt{(t-1)/t}$ ensures the exact reconstruction of all matrices with rank at most $r$ in the noiseless case via the constrained nuclear norm minimization.

Matrix Completion via Max-Norm Constrained Optimization

no code implementations • 2 Mar 2013 • T. Tony Cai, Wen-Xin Zhou

Matrix completion has been well studied under the uniform sampling model and the trace-norm regularized methods perform well both theoretically and numerically in such a setting.

Matrix Completion
