no code implementations • 25 Mar 2024 • Oliver Y. Feng, Yu-Chun Kao, Min Xu, Richard J. Samworth
As an example of a non-log-concave setting, for Cauchy errors, the optimal convex loss function is Huber-like, and our procedure yields an asymptotic efficiency greater than 0.87 relative to the oracle maximum likelihood estimator of the regression coefficients that uses knowledge of this error distribution; in this sense, we obtain robustness without sacrificing much efficiency.
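To make the Huber-like idea concrete, here is a minimal sketch of M-estimation with a standard Huber loss on data with Cauchy errors. The threshold `delta`, the plain gradient-descent fitting routine, and the simulated data are all illustrative choices, not the paper's optimised convex loss or procedure.

```python
import numpy as np

def huber_grad(r, delta=1.345):
    """Derivative of the Huber loss: linear (quadratic loss) near zero,
    clipped at +/- delta (absolute loss) in the tails."""
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

def huber_regression(X, y, delta=1.345, lr=0.01, n_iter=2000):
    """Fit linear regression coefficients by gradient descent on the
    (convex) Huber loss; outliers contribute a bounded gradient."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        r = y - X @ beta
        beta += lr * X.T @ huber_grad(r, delta) / len(y)
    return beta

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.standard_cauchy(n)  # heavy-tailed Cauchy errors
beta_hat = huber_regression(X, y)
```

Because the gradient is bounded, the occasional enormous Cauchy residual cannot drag the fit away, which is the intuition behind robustness at modest efficiency cost.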
no code implementations • 18 Apr 2023 • Tengyao Wang, Edgar Dobriban, Milana Gataric, Richard J. Samworth
We propose a new method for high-dimensional semi-supervised learning problems based on the careful aggregation of the results of a low-dimensional procedure applied to many axis-aligned random projections of the data.
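A simplified, fully supervised caricature of this aggregation idea (the paper's procedure is semi-supervised and aggregates far more carefully): apply a simple low-dimensional rule, here nearest-centroid, to many axis-aligned random projections, i.e. random coordinate subsets, and combine by majority vote. All parameters and the simulated data below are illustrative assumptions.

```python
import numpy as np

def aggregate_random_projections(X, y, X_test, d=5, n_proj=200, seed=0):
    """Average the votes of a nearest-centroid rule applied to many
    axis-aligned random projections (random coordinate subsets)."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    votes = np.zeros(len(X_test))
    for _ in range(n_proj):
        S = rng.choice(p, size=d, replace=False)  # axis-aligned projection
        mu0 = X[y == 0][:, S].mean(axis=0)
        mu1 = X[y == 1][:, S].mean(axis=0)
        # nearest-centroid rule in the projected space
        d0 = ((X_test[:, S] - mu0) ** 2).sum(axis=1)
        d1 = ((X_test[:, S] - mu1) ** 2).sum(axis=1)
        votes += (d1 < d0)
    return (votes / n_proj > 0.5).astype(int)

rng = np.random.default_rng(1)
n, p = 200, 50
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, p))
X[:, :5] += 3.0 * y[:, None]  # signal concentrated on a few coordinates
y_hat = aggregate_random_projections(X, y, X)
```

Projections that happen to hit the informative coordinates vote accurately, and the aggregation lets those votes dominate the uninformative ones.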
1 code implementation • 3 Nov 2022 • Anton Rask Lundborg, Ilmun Kim, Rajen D. Shah, Richard J. Samworth
In this work we study the problem of testing the model-free null of conditional mean independence, i.e. that the conditional mean of $Y$ given $X$ and $Z$ does not depend on $X$.
no code implementations • 2 Sep 2021 • Henry W. J. Reeve, Timothy I. Cannings, Richard J. Samworth
We formulate the problem as one of constrained optimisation, where we seek a low-complexity, data-dependent selection set on which, with a guaranteed probability, the regression function is uniformly at least as large as the threshold; subject to this constraint, we would like the region to contain as much mass under the marginal feature distribution as possible.
no code implementations • 8 Jun 2021 • Henry W. J. Reeve, Timothy I. Cannings, Richard J. Samworth
In transfer learning, we wish to make inference about a target population when we have access to data both from the distribution itself, and from a different but related source distribution.
no code implementations • 26 Jan 2021 • Thomas B. Berrett, Richard J. Samworth
We present the $U$-Statistic Permutation (USP) test of independence in the context of discrete data displayed in a contingency table.
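The USP statistic is a specific $U$-statistic; the sketch below illustrates only the generic permutation mechanism on a contingency table, with a simple chi-squared-type functional standing in for it. The simulated data and statistic are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def permutation_independence_test(x, y, n_perm=999, seed=0):
    """Permutation test of independence for discrete data: recompute the
    statistic on the contingency table after randomly permuting y, which
    is exchangeable under the null of independence."""
    rng = np.random.default_rng(seed)

    def stat(x, y):
        table = np.zeros((x.max() + 1, y.max() + 1))
        np.add.at(table, (x, y), 1)
        n = table.sum()
        expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
        return ((table - expected) ** 2).sum()

    t_obs = stat(x, y)
    t_perm = np.array([stat(x, rng.permutation(y)) for _ in range(n_perm)])
    # permutation p-value, counting the observed statistic itself
    return (1 + np.sum(t_perm >= t_obs)) / (1 + n_perm)

rng = np.random.default_rng(2)
x = rng.integers(0, 3, size=300)
y_dep = (x + rng.integers(0, 2, size=300)) % 3   # y depends on x
p_dep = permutation_independence_test(x, y_dep)
p_ind = permutation_independence_test(x, rng.integers(0, 3, size=300))
```

A key appeal of permutation calibration is that the test is exactly valid in finite samples, whatever the marginal distributions.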
no code implementations • 5 Sep 2020 • Ashwin Pananjady, Richard J. Samworth
Motivated by models for multiway comparison data, we consider the problem of estimating a coordinate-wise isotonic function on the domain $[0, 1]^d$ from noisy observations collected on a uniform lattice, but where the design points have been permuted along each dimension.
no code implementations • 7 Mar 2020 • Yudong Chen, Tengyao Wang, Richard J. Samworth
We introduce a new method for high-dimensional, online changepoint detection in settings where a $p$-variate Gaussian data stream may undergo a change in mean.
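A univariate CUSUM caricature of online mean-change detection (the paper's method handles $p$-variate streams with unknown change coordinates; the `drift` and `threshold` values below are illustrative tuning choices, not the paper's):

```python
import numpy as np

def online_cusum(stream, drift=0.5, threshold=10.0):
    """One-sided CUSUM detector for an upward shift in mean: accumulate
    evidence x_t - drift, reset at zero, and declare a change the first
    time the statistic crosses the threshold."""
    s = 0.0
    for t, x in enumerate(stream):
        s = max(0.0, s + x - drift)
        if s > threshold:
            return t  # declared detection time
    return None

rng = np.random.default_rng(3)
pre = rng.normal(0.0, 1.0, size=500)   # mean 0 before the change
post = rng.normal(1.0, 1.0, size=100)  # mean shifts to 1 at time 500
t_hat = online_cusum(np.concatenate([pre, post]))
```

The trade-off made explicit here is the one that drives the online setting: a higher threshold lowers the false-alarm rate but lengthens the detection delay after the change.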
no code implementations • 15 Jan 2020 • Thomas B. Berrett, Ioannis Kontoyiannis, Richard J. Samworth
We study the problem of independence testing given independent and identically distributed pairs taking values in a $\sigma$-finite, separable measure space.
2 code implementations • 9 Aug 2019 • Jana Janková, Rajen D. Shah, Peter Bühlmann, Richard J. Samworth
We propose a family of tests to assess the goodness-of-fit of a high-dimensional generalized linear model.
Methodology, Statistics Theory
no code implementations • 18 Apr 2019 • Thomas B. Berrett, Richard J. Samworth
One interesting consequence of our results is the discovery that, for certain functionals, the worst-case performance of our estimator may improve on that of the natural 'oracle' estimator, which is given access to the values of the unknown densities at the observations.
no code implementations • 14 Jul 2018 • Thomas B. Berrett, Yi Wang, Rina Foygel Barber, Richard J. Samworth
Like the conditional randomization test of Candès et al. (2018), our test relies on the availability of an approximation to the distribution of $X \mid Z$.
Methodology, Statistics Theory
no code implementations • 29 May 2018 • Timothy I. Cannings, Yingying Fan, Richard J. Samworth
One consequence of these results is that the $k$-nearest neighbour ($k$NN) and SVM classifiers are robust to imperfect training labels, in the sense that the rate of convergence of the excess risks of these classifiers remains unchanged; in fact, our theoretical and empirical results even show that in some cases, imperfect labels may improve the performance of these methods.
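A quick empirical illustration of this robustness on simulated data (not the paper's experiments): compare $k$NN test accuracy when trained on clean labels versus labels flipped independently with probability 0.2.

```python
import numpy as np

def knn_predict(X_tr, y_tr, X_te, k=15):
    """Plain k-nearest-neighbour classification by majority vote."""
    preds = []
    for x in X_te:
        nn = np.argsort(np.linalg.norm(X_tr - x, axis=1))[:k]
        preds.append(int(y_tr[nn].mean() > 0.5))
    return np.array(preds)

rng = np.random.default_rng(7)
n = 1000
X = rng.uniform(-1, 1, size=(n, 2))
y = (X[:, 0] > 0).astype(int)
flip = rng.random(n) < 0.2                 # flip 20% of labels at random
y_noisy = np.where(flip, 1 - y, y)
X_te = rng.uniform(-1, 1, size=(500, 2))
y_te = (X_te[:, 0] > 0).astype(int)
acc_clean = (knn_predict(X, y, X_te) == y_te).mean()
acc_noisy = (knn_predict(X, y_noisy, X_te) == y_te).mean()
```

The intuition is that the majority vote among $k$ neighbours averages out homogeneous label noise, so the classifier's accuracy degrades only marginally.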
no code implementations • 15 Dec 2017 • Milana Gataric, Tengyao Wang, Richard J. Samworth
We introduce a new method for sparse principal component analysis, based on the aggregation of eigenvector information from carefully selected axis-aligned random projections of the sample covariance matrix.
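A rough sketch of the random-projection idea under simplifying assumptions; the selection step (keep the subsets with the largest leading eigenvalue) and the aggregation step (average outer products of the selected eigenvectors) below are naive stand-ins for the paper's procedure, and all parameters are illustrative.

```python
import numpy as np

def sparse_pca_rp(X, k=5, d=5, n_proj=200, n_top=20, seed=0):
    """Sparse PCA sketch: take the leading eigenvector of the sample
    covariance restricted to random coordinate subsets, keep the subsets
    with the largest leading eigenvalue, then aggregate."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    Sigma = np.cov(X, rowvar=False)
    scores, vecs = [], []
    for _ in range(n_proj):
        S = rng.choice(p, size=d, replace=False)  # axis-aligned projection
        w, V = np.linalg.eigh(Sigma[np.ix_(S, S)])
        v = np.zeros(p)
        v[S] = V[:, -1]          # embed the projected eigenvector
        scores.append(w[-1])     # leading eigenvalue of the submatrix
        vecs.append(v)
    top = np.argsort(scores)[-n_top:]
    # aggregate eigenvector information across the best projections
    A = sum(np.outer(vecs[i], vecs[i]) for i in top)
    v_hat = np.linalg.eigh(A)[1][:, -1]
    support = set(np.argsort(np.abs(v_hat))[-k:])  # largest loadings
    return v_hat, support

rng = np.random.default_rng(4)
n, p = 300, 40
v = np.zeros(p)
v[:5] = 1 / np.sqrt(5)  # sparse leading direction on 5 coordinates
X = rng.normal(size=(n, p)) + 3.0 * rng.normal(size=(n, 1)) * v
v_hat, support = sparse_pca_rp(X)
```

Subsets that overlap the sparse support inherit a large leading eigenvalue, so the selection step concentrates the aggregation on informative coordinates.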
no code implementations • 17 Nov 2017 • Thomas B. Berrett, Richard J. Samworth
We propose a test of independence of two multivariate random vectors, given a sample from the underlying population.
no code implementations • 3 Apr 2017 • Timothy I. Cannings, Thomas B. Berrett, Richard J. Samworth
We derive a new asymptotic expansion for the global excess risk of a local-$k$-nearest neighbour classifier, where the choice of $k$ may depend upon the test point.
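A minimal sketch of a local-$k$ classifier, in which the number of neighbours varies with the test point; the rule `k_of_x` below is an arbitrary illustrative choice (fewer neighbours near the decision boundary), not the paper's data-driven selection.

```python
import numpy as np

def local_knn_classify(X_train, y_train, X_test, k_of_x):
    """k-nearest-neighbour classification where k may depend on the
    test point via a user-supplied rule k_of_x."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)
        k = k_of_x(x)
        nn = np.argsort(dists)[:k]
        preds.append(int(y_train[nn].mean() > 0.5))
    return np.array(preds)

rng = np.random.default_rng(5)
n = 400
X = rng.uniform(-1, 1, size=(n, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def k_rule(x):
    # illustrative: small k near the boundary, larger k elsewhere
    return 5 if abs(x[0] + x[1]) < 0.3 else 25

y_hat = local_knn_classify(X, y, X, k_rule)
```

Varying $k$ with the test point trades bias against variance locally: a small $k$ limits bias where the regression function changes quickly, while a large $k$ reduces variance where it is flat.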
no code implementations • 22 Aug 2014 • Tengyao Wang, Quentin Berthet, Richard J. Samworth
In this paper, we show that, under a widely believed assumption from computational complexity theory, there is a fundamental trade-off between statistical and computational performance in this problem.