no code implementations • 27 Nov 2023 • Thomas Chen
In the overparametrized case, we prove that, provided a rank condition holds, all orbits of the modified gradient descent drive the ${\mathcal L}^2$ cost to its global minimum at a uniform exponential convergence rate; one thereby obtains an a priori stopping time for any prescribed proximity to the global minimum.
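A uniform exponential rate yields an a priori stopping time in closed form: if the cost satisfies $C_k \le C_0 e^{-\lambda k}$, then $k^* = \lceil \lambda^{-1}\log(C_0/\varepsilon) \rceil$ steps guarantee proximity $\varepsilon$. The sketch below illustrates only this consequence on a toy quadratic; the rate `rate`, initial cost `c0`, and the one-dimensional example are illustrative assumptions, not the paper's construction.

```python
import math

def a_priori_stopping_time(c0: float, rate: float, eps: float) -> int:
    """Smallest step count k with c0 * exp(-rate * k) <= eps."""
    return max(0, math.ceil(math.log(c0 / eps) / rate))

# Toy check (hypothetical example): gradient descent on f(x) = x^2 / 2
# contracts x by (1 - lr) per step, so the cost decays at the exact
# per-step rate -2*log(1 - lr).
lr, x, eps = 0.1, 1.0, 1e-6
c0 = x * x / 2
rate = -2 * math.log(1 - lr)
k_star = a_priori_stopping_time(c0, rate, eps)
for _ in range(k_star):
    x -= lr * x          # gradient step: f'(x) = x
assert x * x / 2 <= eps  # cost is below eps at the predicted stopping time
```

The stopping time is computed before running any iterations, which is the practical content of an "a priori" bound: no monitoring of the cost along the trajectory is needed.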
no code implementations • 13 Nov 2023 • Thomas Chen, Patricia Muñoz Ewald
We analyze geometric aspects of the gradient descent algorithm in Deep Learning (DL) networks.
no code implementations • 19 Sep 2023 • Thomas Chen, Patricia Muñoz Ewald
In this paper, we explicitly determine local and global minimizers of the $\mathcal{L}^2$ cost function in underparametrized Deep Learning (DL) networks; our main goal is to shed light on their geometric structure and properties.
no code implementations • 19 Sep 2023 • Thomas Chen, Patricia Muñoz Ewald
In this paper, we approach the problem of cost (loss) minimization in underparametrized shallow neural networks through the explicit construction of upper bounds, without any use of gradient descent.
no code implementations • 21 Aug 2021 • Daniel Li, Thomas Chen, Albert Tung, Lydia Chilton
These concerns all demonstrate the need for an interactive system distinctly tailored to speech, to help users understand and navigate the spoken-language domain.
Automatic Speech Recognition (ASR) +4