Search Results for author: Sekhar Tatikonda

Found 14 papers, 5 papers with code

Surrogate Gap Minimization Improves Sharpness-Aware Training

1 code implementation ICLR 2022 Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Yin Cui, Hartwig Adam, Nicha Dvornek, Sekhar Tatikonda, James Duncan, Ting Liu

Instead, we define a "surrogate gap", a measure equivalent to the dominant eigenvalue of the Hessian at a local minimum when the radius of the neighborhood (used to derive the perturbed loss) is small.
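
For intuition, a minimal NumPy sketch (not the paper's GSAM implementation) of the surrogate gap on a toy quadratic loss, where for a small radius it recovers the dominant Hessian eigenvalue; the Hessian, radius, and variable names below are illustrative:

```python
# Surrogate gap h(theta) = max_{||delta|| <= rho} L(theta + delta) - L(theta).
# At the minimum of a quadratic loss, h equals (rho^2 / 2) * lambda_max,
# the dominant eigenvalue of the Hessian scaled by the perturbation radius.
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 5))
H = H @ H.T                          # symmetric positive-definite toy Hessian
loss = lambda theta: 0.5 * theta @ H @ theta

theta_star = np.zeros(5)             # the local (here global) minimum
rho = 1e-2                           # small neighborhood radius

# For a quadratic, the worst-case perturbation is the top Hessian eigenvector.
eigvals, eigvecs = np.linalg.eigh(H)
delta = rho * eigvecs[:, -1]

surrogate_gap = loss(theta_star + delta) - loss(theta_star)
print(surrogate_gap / (0.5 * rho ** 2))   # ~ eigvals[-1], the dominant eigenvalue
print(eigvals[-1])
```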

Momentum Centering and Asynchronous Update for Adaptive Gradient Methods

2 code implementations NeurIPS 2021 Juntang Zhuang, Yifan Ding, Tommy Tang, Nicha Dvornek, Sekhar Tatikonda, James S. Duncan

We demonstrate that ACProp has a convergence rate of $O(\frac{1}{\sqrt{T}})$ for the stochastic non-convex case, which matches the oracle rate and outperforms the $O(\frac{\log T}{\sqrt{T}})$ rate of RMSProp and Adam.
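
A minimal sketch of the two ingredients named in the title, momentum centering and an asynchronous (delayed) second-moment estimate; this is not the authors' reference implementation, and bias correction, warm-up, and other details are omitted:

```python
import numpy as np

def acprop_step(theta, g, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One toy update with a delayed, centered second moment.

    Asynchronous: the denominator uses only statistics accumulated from
    *previous* gradients, so the current gradient g does not precondition itself.
    Centered: the second moment tracks (g - m)^2, the deviation from the EMA,
    rather than g^2.
    """
    m, s = state
    new_theta = theta - lr * g / (np.sqrt(s) + eps)   # built from g_1 .. g_{t-1} only
    m = beta1 * m + (1 - beta1) * g                   # momentum (EMA of gradients)
    s = beta2 * s + (1 - beta2) * (g - m) ** 2        # centered second moment
    return new_theta, (m, s)

# Toy usage on the gradient of 0.5 * ||theta||^2; s is warm-started with a small
# constant, since the very first delayed step would otherwise divide by eps alone.
theta, state = np.array([1.0, -2.0]), (np.zeros(2), np.full(2, 1e-3))
theta, state = acprop_step(theta, theta.copy(), state)
print(theta)
```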

Image Classification

Multiple-shooting adjoint method for whole-brain dynamic causal modeling

no code implementations 14 Feb 2021 Juntang Zhuang, Nicha Dvornek, Sekhar Tatikonda, Xenophon Papademetris, Pamela Ventola, James Duncan

Furthermore, MSA uses the adjoint method for accurate gradient estimation in the ODE; because the adjoint method is generic, MSA applies to both linear and non-linear systems and does not require re-deriving the algorithm for each model, as EM does.
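
A minimal sketch of a multiple-shooting objective, not the paper's MSA code: the time axis is split into segments, each integrated from its own guessed initial state, with a penalty gluing consecutive segments together. The linear toy dynamics, segment lengths, and penalty weight below are illustrative; the paper obtains gradients of such an objective with the adjoint method, which is not reproduced here.

```python
import numpy as np

def simulate(x0, A, n_steps, dt=0.01):
    """Forward-Euler rollout of the toy linear ODE dx/dt = A x."""
    traj = [x0]
    for _ in range(n_steps):
        traj.append(traj[-1] + dt * A @ traj[-1])
    return np.stack(traj)

def multiple_shooting_loss(init_states, A, observations, seg_len, lam=10.0, dt=0.01):
    """Data fit on each segment plus a continuity penalty between segments."""
    loss = 0.0
    for k, x0 in enumerate(init_states):
        traj = simulate(x0, A, seg_len, dt)
        obs = observations[k * seg_len:(k + 1) * seg_len + 1]
        loss += np.sum((traj - obs) ** 2)                    # fit this segment
        if k + 1 < len(init_states):                         # glue to next segment
            loss += lam * np.sum((traj[-1] - init_states[k + 1]) ** 2)
    return loss

# Toy usage: two segments of 50 steps on a 3-node system; the loss is ~0 when
# the model and the segment initial states match the data exactly.
rng = np.random.default_rng(1)
A_true = -np.eye(3) + 0.1 * rng.standard_normal((3, 3))
observations = simulate(np.ones(3), A_true, 100)
init_states = [observations[0].copy(), observations[50].copy()]
print(multiple_shooting_loss(init_states, A_true, observations, seg_len=50))
```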

AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients

8 code implementations NeurIPS 2020 Juntang Zhuang, Tommy Tang, Yifan Ding, Sekhar Tatikonda, Nicha Dvornek, Xenophon Papademetris, James S. Duncan

We view the exponential moving average (EMA) of the noisy gradient as a prediction of the gradient at the next time step: if the observed gradient deviates greatly from the prediction, we distrust the current observation and take a small step; if the observed gradient is close to the prediction, we trust it and take a large step.
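
A minimal NumPy sketch of this update rule (the released AdaBelief optimizer includes additional options, such as weight decay, that are omitted here); the toy usage at the end assumes the gradient of 0.5 * ||theta||^2:

```python
import numpy as np

def adabelief_step(theta, g, m, s, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * g               # EMA of gradients: the "prediction"
    s = beta2 * s + (1 - beta2) * (g - m) ** 2    # belief: squared deviation from prediction
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    s_hat = s / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(s_hat) + eps)   # small step when belief is weak
    return theta, m, s

theta, m, s = np.array([1.0, -1.0]), np.zeros(2), np.zeros(2)
for t in range(1, 4):
    theta, m, s = adabelief_step(theta, theta.copy(), m, s, t)
print(theta)
```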

Image Classification Language Modelling

Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE

2 code implementations ICML 2020 Juntang Zhuang, Nicha Dvornek, Xiaoxiao Li, Sekhar Tatikonda, Xenophon Papademetris, James Duncan

Neural ordinary differential equations (NODEs) have recently attracted increasing attention; however, their empirical performance on benchmark tasks (e.g. image classification) is significantly inferior to that of discrete-layer models.

General Classification Image Classification +2

Zero-shot Transfer Learning for Semantic Parsing

no code implementations 27 Aug 2018 Javid Dadashkarimi, Alexander Fabbri, Sekhar Tatikonda, Dragomir R. Radev

In this paper we propose to use feature transfer in a zero-shot experimental setting on the task of semantic parsing.

Semantic Parsing Transfer Learning

Sequence to Logic with Copy and Cache

no code implementations 19 Jul 2018 Javid Dadashkarimi, Sekhar Tatikonda

Generating logical-form equivalents of human language is a fresh application of neural architectures in which long short-term memory effectively captures dependencies in both the encoder and the decoder units.

A New Approach to Laplacian Solvers and Flow Problems

no code implementations 22 Nov 2016 Patrick Rebeschini, Sekhar Tatikonda

This paper investigates the behavior of the Min-Sum message passing scheme to solve systems of linear equations in the Laplacian matrices of graphs and to compute electric flows.
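
The Min-Sum scheme itself is not reproduced here; for reference, a minimal sketch of the problem it targets: solving the Laplacian system L v = b for node potentials (grounding one node to handle the null space) and reading the electric flow on each edge off the potential differences. The example graph and injection vector are illustrative, and a direct solve stands in for the message-passing scheme.

```python
import numpy as np

# Toy graph: 4 nodes on a cycle with unit conductances; edges as (i, j) pairs.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4
L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1
    L[i, j] -= 1; L[j, i] -= 1

b = np.array([1.0, 0.0, 0.0, -1.0])   # inject one unit at node 0, extract at node 3

# Ground node 0: the Laplacian is singular (constants lie in its null space).
v = np.zeros(n)
v[1:] = np.linalg.solve(L[1:, 1:], b[1:])

# With unit conductances, the electric flow on edge (i, j) is v_i - v_j.
flows = {(i, j): v[i] - v[j] for i, j in edges}
print(flows)   # flows on the two paths from node 0 to node 3 sum to one unit
```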

Scale-free network optimization: foundations and algorithms

no code implementations 12 Feb 2016 Patrick Rebeschini, Sekhar Tatikonda

We propose a notion of correlation in constrained optimization that is based on the sensitivity of the optimal solution to perturbations of the constraints.

Lossy Compression via Sparse Linear Regression: Computationally Efficient Encoding and Decoding

no code implementations 7 Dec 2012 Ramji Venkataramanan, Tuhin Sarkar, Sekhar Tatikonda

The proposed encoding algorithm sequentially chooses columns of the design matrix to successively approximate the source sequence.
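
A minimal sketch of this kind of sequential encoder, not the paper's exact construction: the design matrix is split into L sections of M columns; for each section in turn, the column most correlated with the current residual is selected, its scaled copy is subtracted, and the residual carries over to the next section. Section counts and coefficients below are illustrative (the paper specifies the coefficients differently).

```python
import numpy as np

rng = np.random.default_rng(0)
n, L, M = 64, 8, 32                                 # block length, sections, columns per section
A = rng.standard_normal((n, L * M)) / np.sqrt(n)    # i.i.d. Gaussian design matrix
c = np.ones(L)                                      # per-section coefficients (simplified)

source = rng.standard_normal(n)                     # source sequence to be compressed
residual = source.copy()
chosen = []                                         # selected column index in each section

for l in range(L):
    section = A[:, l * M:(l + 1) * M]
    j = int(np.argmax(section.T @ residual))        # column most correlated with residual
    chosen.append(j)
    residual = residual - c[l] * section[:, j]      # successively approximate the source

distortion = np.mean(residual ** 2)                 # squared-error distortion of the codeword
print(chosen, distortion)
```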

regression

Lossy Compression via Sparse Linear Regression: Performance under Minimum-distance Encoding

no code implementations 3 Feb 2012 Ramji Venkataramanan, Antony Joseph, Sekhar Tatikonda

We study a new class of codes for lossy compression with the squared-error distortion criterion, designed using the statistical framework of high-dimensional linear regression.

regression
