Speeding up PCA with priming

8 Sep 2021 · Bálint Máté, François Fleuret ·

We introduce primed-PCA (pPCA), a two-step algorithm for speeding up the approximation of principal components. This algorithm first runs any approximate-PCA method to get an initial estimate of the principal components (priming), and then applies an exact PCA in the subspace they span. Since this subspace is of small dimension in any practical use, the second step is extremely cheap computationally. Nonetheless, it improves accuracy significantly for a given computational budget across datasets. In this setup, the purpose of the priming is to narrow down the search space, and prepare the data for the second step, an exact calculation. We show formally that pPCA improves upon the priming algorithm under very mild conditions, and we provide experimental validation on both synthetic and real large-scale datasets showing that it systematically translates to improved performance. In our experiments we prime pPCA by several approximate algorithms and report an average speedup by a factor of 7.2 over Oja's rule, and a factor of 10.5 over EigenGame.

PDF Abstract