no code implementations • 6 Oct 2023 • Fangshuo Liao, Junhyung Lyle Kim, Cruz Barnum, Anastasios Kyrillidis
Principal Component Analysis (PCA) is a popular tool in data analysis, especially when the data is high-dimensional.
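For context, a minimal SVD-based PCA sketch, assuming NumPy; `pca` and `n_components` are illustrative names, and this generic textbook routine is not the method analyzed in the paper:

```python
# Minimal PCA sketch (assumes NumPy); illustrative, not the paper's algorithm.
import numpy as np

def pca(X, n_components=2):
    """Project X (n_samples x n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)          # center each feature
    # SVD of the centered data: rows of Vt are the principal axes
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T  # scores in the reduced space

# Example: reduce 100 points in 50 dimensions to 2 dimensions
X = np.random.randn(100, 50)
Z = pca(X, n_components=2)
print(Z.shape)  # (100, 2)
```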
no code implementations • 13 Jun 2023 • Fangshuo Liao, Anastasios Kyrillidis
Current state-of-the-art analyses of the convergence of gradient descent for training neural networks focus on characterizing properties of the loss landscape, such as the Polyak-Łojasiewicz (PL) condition and restricted strong convexity.
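For reference, the standard PL condition (stated here from the general literature, not quoted from the paper) requires, for some $\mu > 0$:

```latex
% Polyak-Łojasiewicz (PL) condition on a loss f with minimum f^*:
\frac{1}{2}\,\lVert \nabla f(w) \rVert^2 \;\ge\; \mu \bigl( f(w) - f^* \bigr)
\quad \text{for all } w.
% Under L-smoothness, gradient descent with step size 1/L then converges linearly:
% f(w_t) - f^* \le (1 - \mu/L)^t \, \bigl( f(w_0) - f^* \bigr).
```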
no code implementations • 29 Oct 2022 • Zheyang Xiong, Fangshuo Liao, Anastasios Kyrillidis
The strong Lottery Ticket Hypothesis (LTH) claims the existence of a subnetwork in a sufficiently large, randomly initialized neural network that approximates a given target network without any training.
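A toy illustration of the object the strong LTH concerns, assuming NumPy: a subnetwork is a binary mask over fixed random weights. The mask below is sampled at random for illustration, not the carefully chosen one whose existence the hypothesis asserts:

```python
# Strong-LTH sketch (assumes NumPy): a subnetwork is a binary mask applied
# to fixed random weights -- no weight is ever trained.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 10, 256, 1

# Fixed, randomly initialized two-layer network (never updated)
W1 = rng.standard_normal((d_hidden, d_in)) / np.sqrt(d_in)
W2 = rng.standard_normal((d_out, d_hidden)) / np.sqrt(d_hidden)

# A "ticket": keep ~10% of the random weights via binary masks. The strong
# LTH says a wide enough random network contains some mask under which the
# masked network approximates a given target network.
M1 = rng.random(W1.shape) < 0.1
M2 = rng.random(W2.shape) < 0.1

def subnetwork(x):
    h = np.maximum((M1 * W1) @ x, 0.0)  # masked first layer + ReLU
    return (M2 * W2) @ h                # masked second layer

print(subnetwork(rng.standard_normal(d_in)))
```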
no code implementations • 28 Oct 2022 • Qihan Wang, Chen Dun, Fangshuo Liao, Chris Jermaine, Anastasios Kyrillidis
LoFT is a model-parallel pretraining algorithm that partitions convolutional layers by filters so they can be trained independently in a distributed setting, reducing memory and communication costs during pretraining.
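A hypothetical sketch of the filter-wise split-and-merge idea, assuming PyTorch; `split_conv` and `merge_convs` are illustrative names, not LoFT's actual API:

```python
# Sketch of partitioning a conv layer by filters (assumes PyTorch);
# illustrative of the idea, not LoFT's implementation.
import torch
import torch.nn as nn

def split_conv(conv: nn.Conv2d, num_workers: int):
    """Partition a conv layer's output filters into independent sub-layers."""
    out_per_worker = conv.out_channels // num_workers
    shards = []
    for w in range(num_workers):
        shard = nn.Conv2d(conv.in_channels, out_per_worker,
                          conv.kernel_size, conv.stride, conv.padding)
        lo, hi = w * out_per_worker, (w + 1) * out_per_worker
        shard.weight.data.copy_(conv.weight.data[lo:hi])
        shard.bias.data.copy_(conv.bias.data[lo:hi])
        shards.append(shard)  # each shard can be trained on its own worker
    return shards

def merge_convs(shards):
    """Reassemble a full conv layer by concatenating filters from all shards."""
    merged = nn.Conv2d(shards[0].in_channels,
                       sum(s.out_channels for s in shards),
                       shards[0].kernel_size, shards[0].stride, shards[0].padding)
    merged.weight.data.copy_(torch.cat([s.weight.data for s in shards], dim=0))
    merged.bias.data.copy_(torch.cat([s.bias.data for s in shards], dim=0))
    return merged

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
shards = split_conv(conv, num_workers=4)  # 4 x (3 -> 16) conv layers
full = merge_convs(shards)                # back to a (3 -> 64) conv layer
```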
no code implementations • 5 Dec 2021 • Fangshuo Liao, Anastasios Kyrillidis
Motivated by the goal of training all the parameters of a neural network, we study why and when this can be achieved by iteratively creating, training, and combining randomly selected subnetworks.
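A toy, single-worker sketch of the create/train/combine loop, assuming NumPy; the setup below is illustrative, not the paper's exact scheme:

```python
# Toy create/train/combine loop (assumes NumPy): each round samples a random
# subset of hidden neurons and updates only those weights, a single-worker
# simplification of independent subnetwork training.
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 200, 5, 64
X = rng.standard_normal((n, d))
y = np.sin(X @ rng.standard_normal(d))         # toy regression target

W1 = rng.standard_normal((h, d)) / np.sqrt(d)  # hidden-layer weights
w2 = rng.standard_normal(h) / np.sqrt(h)       # output weights

lr = 1e-2
for round_ in range(100):
    mask = rng.random(h) < 0.5                 # create: random subnetwork
    for _ in range(5):                         # train: steps on the subnetwork
        H = np.maximum(X @ W1.T, 0.0) * mask   # only masked neurons are active
        resid = H @ w2 - y
        grad_w2 = H.T @ resid / n
        grad_W1 = (np.outer(resid, w2 * mask) * (H > 0)).T @ X / n
        w2 -= lr * grad_w2                     # combine: updates land in the
        W1 -= lr * grad_W1                     # shared full-network weights

print("final MSE:", np.mean((np.maximum(X @ W1.T, 0) @ w2 - y) ** 2))
```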
no code implementations • 31 Jul 2021 • Cameron R. Wolfe, Fangshuo Liao, Qihan Wang, Junhyung Lyle Kim, Anastasios Kyrillidis
Aiming to mathematically analyze how much dense-network pre-training a pruned network needs in order to perform well, we derive a simple theoretical bound on the number of gradient descent pre-training iterations for a two-layer, fully-connected network, beyond which pruning via greedy forward selection [61] yields a subnetwork with good training error.
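A hypothetical sketch of pruning via greedy forward selection on a toy two-layer network, assuming NumPy; refitting the output layer by least squares at each step is a simplification for illustration, not necessarily the exact procedure of [61]:

```python
# Greedy forward selection sketch (assumes NumPy): starting from an empty
# subnetwork, repeatedly add the hidden neuron that most reduces training
# error, refitting the output layer by least squares at each step.
import numpy as np

rng = np.random.default_rng(0)
n, d, h, k = 200, 5, 128, 10          # k = size of the pruned subnetwork
X = rng.standard_normal((n, d))
y = np.tanh(X @ rng.standard_normal(d))

W1 = rng.standard_normal((h, d)) / np.sqrt(d)  # "pretrained" hidden layer
H = np.maximum(X @ W1.T, 0.0)                  # neuron activations (n x h)

selected = []
for _ in range(k):
    best_j, best_err = None, np.inf
    for j in range(h):
        if j in selected:
            continue
        cols = H[:, selected + [j]]
        # refit output weights on the candidate subnetwork
        w, *_ = np.linalg.lstsq(cols, y, rcond=None)
        err = np.mean((cols @ w - y) ** 2)
        if err < best_err:
            best_j, best_err = j, err
    selected.append(best_j)
    print(f"size {len(selected):2d}: train MSE {best_err:.4f}")
```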