How to Improve Sample Complexity of SGD over Highly Dependent Data?

29 Sep 2021 · Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang

Conventional machine learning applications typically assume that data samples are independently and identically distributed (i.i.d.). However, many practical scenarios naturally involve a data-generating process that produces highly dependent data samples, which are known to heavily bias the stochastic optimization process and slow down the convergence of learning. In this paper, we conduct a fundamental study on how to facilitate the convergence of SGD over highly dependent data using different popular update schemes. Specifically, with a $\phi$-mixing model that captures both exponential and polynomial decay of the data dependence over time, we show that SGD with periodic data-subsampling achieves an improved sample complexity over the standard SGD in the full spectrum of the $\phi$-mixing data dependence. Moreover, we show that by fully utilizing the data, mini-batch SGD can further substantially improve the sample complexity with highly dependent data. Numerical experiments validate our theory.
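No code accompanies the paper, but the two update schemes discussed in the abstract are easy to illustrate. Below is a minimal sketch (not the authors' implementation): least-squares regression on a dependent data stream generated by an AR(1) process, which mimics data dependence that decays over time. The data model, function names, step sizes, and the `gap`/`batch` parameters are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
w_star = rng.normal(size=d)              # ground-truth parameter

def dependent_stream(n, rho=0.9):
    """Yield (x_t, y_t) pairs whose features follow an AR(1) chain,
    so consecutive samples are highly dependent."""
    x = rng.normal(size=d)
    for _ in range(n):
        x = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=d)
        y = x @ w_star + 0.1 * rng.normal()
        yield x, y

def sgd_subsampled(n, gap=1, lr=0.05):
    """SGD that updates on only every `gap`-th sample (gap=1 is standard SGD).
    A larger gap weakens the dependence between consecutive updates,
    which is the idea behind periodic data-subsampling."""
    w = np.zeros(d)
    for t, (x, y) in enumerate(dependent_stream(n)):
        if t % gap:
            continue                      # discard samples between updates
        grad = (x @ w - y) * x            # gradient of 0.5 * (x @ w - y)^2
        w -= lr * grad
    return w

def minibatch_sgd(n, batch=20, lr=0.05):
    """Mini-batch SGD that averages gradients over consecutive samples,
    so no data are discarded."""
    w = np.zeros(d)
    buf = []
    for x, y in dependent_stream(n):
        buf.append((x, y))
        if len(buf) == batch:
            grad = np.mean([(xi @ w - yi) * xi for xi, yi in buf], axis=0)
            w -= lr * grad
            buf.clear()
    return w

n = 20_000
for name, w in [("standard SGD", sgd_subsampled(n, gap=1)),
                ("subsampled SGD (gap=10)", sgd_subsampled(n, gap=10)),
                ("mini-batch SGD", minibatch_sgd(n))]:
    print(f"{name:25s} ||w - w*|| = {np.linalg.norm(w - w_star):.3f}")
```

The sketch only contrasts the update rules; the paper's sample-complexity guarantees under $\phi$-mixing dependence are not reproduced here.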
