An adaptive multiclass nearest neighbor classifier

8 Apr 2018 · Nikita Puchkin, Vladimir Spokoiny

We consider a multiclass classification problem, where the training sample $S_n = \{(X_i, Y_i)\}_{i=1}^n$ is generated from the model $\mathbb P(Y = m | X = x) = \eta_m(x)$, $1 \leq m \leq M$, and $\eta_1(x), \dots, \eta_M(x)$ are unknown $\alpha$-Hölder continuous functions. Given a test point $X$, our goal is to predict its label. The widely used $\mathsf k$-nearest-neighbors classifier constructs estimates of $\eta_1(X), \dots, \eta_M(X)$ and applies a plug-in rule for the prediction. However, it requires a proper choice of the smoothing parameter $\mathsf k$, which can be difficult in practice. In our solution, we fix several integers $n_1, \dots, n_K$, compute the corresponding $n_k$-nearest-neighbor estimates for each $m$ and each $n_k$, and apply an aggregation procedure. We study an algorithm that constructs a convex combination of these estimates such that the aggregated estimate behaves approximately as well as an oracle choice. We also provide a non-asymptotic analysis of the procedure, prove its adaptation to the unknown smoothness parameter $\alpha$ and to the margin, and establish rates of convergence under mild assumptions.
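To make the overall scheme concrete, here is a minimal illustrative sketch of the plug-in idea: $n_k$-nearest-neighbor estimates of the class probabilities are computed for several neighborhood sizes and combined into a convex combination before the plug-in prediction. The weighting rule below (exponential weights of a training-accuracy score, with a `temperature` parameter) is a hypothetical stand-in, not the paper's aggregation procedure.

```python
# Sketch only: aggregate k-NN class-probability estimates over several
# neighborhood sizes with a convex combination, then predict by plug-in.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def aggregated_knn_predict(X_train, y_train, X_test, ks=(5, 15, 45), temperature=1.0):
    """Predict labels by aggregating k-NN probability estimates over several k."""
    estimates = []   # per-k estimates of (eta_1(X), ..., eta_M(X)) at the test points
    weights = []     # unnormalized convex-combination weights (hypothetical choice)
    for k in ks:
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
        estimates.append(clf.predict_proba(X_test))
        # Weight each k by its training accuracy -- a crude proxy for the
        # data-driven aggregation criterion studied in the paper.
        weights.append(np.exp(temperature * clf.score(X_train, y_train)))
    weights = np.array(weights) / np.sum(weights)                  # convex weights
    eta_hat = np.tensordot(weights, np.stack(estimates), axes=1)   # aggregated estimate
    return clf.classes_[np.argmax(eta_hat, axis=1)]                # plug-in rule
```

In this sketch the oracle-like behavior is only mimicked: a good choice of $n_k$ receives a larger weight, so the aggregated estimate tracks the better single-$\mathsf k$ classifiers without committing to one smoothing parameter in advance.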
