no code implementations • 30 Jan 2024 • Joshua Hanson, Maxim Raginsky
In this net, the output "weights" are taken from the signature of the control input -- a tool used to represent infinite-dimensional paths as a sequence of tensors -- which comprises iterated integrals of the control input over a simplex.
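For concreteness, the low-order signature terms of a piecewise-linear path can be computed directly: each linear segment has an explicit signature, and segments are stitched together with Chen's identity. The sketch below (ours, not the paper's implementation) truncates at depth 2.

```python
# Minimal sketch: depth-2 signature of a piecewise-linear path via
# Chen's identity. A path is a list of d-dimensional points.

def segment_sig(p, q):
    """Depth-1 and depth-2 signature of one linear segment from p to q."""
    d = len(p)
    inc = [q[i] - p[i] for i in range(d)]            # level 1: the increment
    lvl2 = [[inc[i] * inc[j] / 2.0 for j in range(d)] for i in range(d)]
    return inc, lvl2

def chen(sig_a, sig_b):
    """Concatenate two signatures (Chen's identity, truncated at depth 2)."""
    a1, a2 = sig_a
    b1, b2 = sig_b
    d = len(a1)
    c1 = [a1[i] + b1[i] for i in range(d)]
    c2 = [[a2[i][j] + b2[i][j] + a1[i] * b1[j] for j in range(d)]
          for i in range(d)]
    return c1, c2

def signature(path):
    """Truncated (depth-2) signature of a piecewise-linear path."""
    sig = segment_sig(path[0], path[1])
    for k in range(1, len(path) - 1):
        sig = chen(sig, segment_sig(path[k], path[k + 1]))
    return sig
```

The level-1 term is the total increment; the antisymmetric part of the level-2 term is the Lévy area, and higher levels (omitted here) refine the description of the path further.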
no code implementations • 23 Jan 2024 • Dylan Zhang, Curt Tigges, Zory Zhang, Stella Biderman, Maxim Raginsky, Talia Ringer
The framework includes a representation that captures the general \textit{syntax} of structural recursion, coupled with two different frameworks for understanding their \textit{semantics} -- one that is more natural from a programming languages perspective and one that helps bridge that perspective with a mechanistic understanding of the underlying transformer architecture.
no code implementations • 8 Sep 2023 • Fredrik Hellström, Giuseppe Durisi, Benjamin Guedj, Maxim Raginsky
Over the past decades, the PAC-Bayesian approach has been established as a flexible framework for analyzing the generalization capabilities of machine learning algorithms and for designing new ones.
no code implementations • 1 Jul 2023 • Tanya Veeravalli, Maxim Raginsky
The problem of function approximation by neural dynamical systems has typically been approached in a top-down manner: any continuous function can be approximated to arbitrary accuracy by a sufficiently complex model with a given architecture.
no code implementations • 24 May 2023 • Shizhuo Dylan Zhang, Curt Tigges, Stella Biderman, Maxim Raginsky, Talia Ringer
Neural networks have in recent years shown promise for helping software engineers write programs and even formally verify them.
no code implementations • NeurIPS 2023 • Yifeng Chu, Maxim Raginsky
This paper presents a general methodology for deriving information-theoretic generalization bounds for learning algorithms.
no code implementations • 4 May 2023 • Yifeng Chu, Maxim Raginsky
The majorizing measure theorem of Fernique and Talagrand is a fundamental result in the theory of random processes.
no code implementations • 27 Apr 2023 • Yifeng Chu, Maxim Raginsky
We obtain an upper bound on the expected supremum of a Bernoulli process indexed by the image of an index set under a uniformly Lipschitz function class in terms of properties of the index set and the function class, extending an earlier result of Maurer for Gaussian processes.
no code implementations • 16 Mar 2023 • Belinda Tzen, Anant Raj, Maxim Raginsky, Francis Bach
Mirror descent, introduced by Nemirovski and Yudin in the 1970s, is a primal-dual convex optimization method that can be tailored to the geometry of the optimization problem at hand through the choice of a strongly convex potential function.
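The canonical instance of tailoring the potential to the geometry is mirror descent on the probability simplex with the negative-entropy potential, which reduces to a multiplicative (exponentiated-gradient) update. A minimal sketch, with an illustrative linear objective of our choosing:

```python
import math

# Mirror descent on the probability simplex with the negative-entropy
# potential (exponentiated gradient). Minimizes the linear objective
# f(x) = <c, x> over the simplex; the minimizer concentrates on the
# coordinate with the smallest cost c_i.

def mirror_descent_simplex(c, steps=200, eta=0.5):
    n = len(c)
    x = [1.0 / n] * n                      # start at the uniform distribution
    for _ in range(steps):
        # gradient of the linear objective is just c; the mirror step with
        # negative entropy becomes a multiplicative update + renormalization
        w = [xi * math.exp(-eta * ci) for xi, ci in zip(x, c)]
        z = sum(w)
        x = [wi / z for wi in w]
    return x

x = mirror_descent_simplex([3.0, 1.0, 2.0])
```

With the Euclidean potential the same scheme recovers projected gradient descent; the entropic choice is what adapts the method to simplex geometry.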
no code implementations • 1 Dec 2022 • Tanya Veeravalli, Maxim Raginsky
There has been a great deal of recent interest in learning and approximation of functions that can be expressed as expectations of a given nonlinearity with respect to its random internal parameters.
no code implementations • 3 Apr 2022 • Joshua Hanson, Maxim Raginsky
This paper describes an approach for fitting an immersed submanifold of a finite-dimensional Euclidean space to random samples.
no code implementations • 14 Feb 2022 • Alan Yang, Jie Xiong, Maxim Raginsky, Elyse Rosenbaum
This paper proposes a class of neural ordinary differential equations parametrized by provably input-to-state stable continuous-time recurrent neural networks.
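The stability mechanism can be illustrated with a toy continuous-time recurrent net integrated by forward Euler. This is a sketch under our own assumptions (the parameterization and the sufficient condition below are illustrative, not the paper's): with dynamics x' = -a*x + W*tanh(x) + B*u and a exceeding the norm of W, the 1-Lipschitz tanh makes the unforced dynamics contractive, so bounded inputs yield bounded states — the flavor of input-to-state stability.

```python
import math, random

# Forward-Euler simulation of x' = -a*x + W*tanh(x) + B*u(t).
# Illustrative stability condition: a > ||W|| makes the dynamics
# contractive, so the state forgets its initial condition and stays
# bounded for bounded inputs.

def simulate(a, W, B, u, x0, dt=0.01, T=1000):
    n = len(x0)
    x = list(x0)
    for t in range(T):
        tanh_x = [math.tanh(xi) for xi in x]
        dx = [-a * x[i]
              + sum(W[i][j] * tanh_x[j] for j in range(n))
              + B[i] * u(t * dt)
              for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
    return x

random.seed(0)
n = 3
W = [[0.3 * random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
x_final = simulate(a=2.0, W=W, B=[1.0] * n, u=lambda t: math.sin(t),
                   x0=[5.0] * n)
```

Despite the large initial condition, the trajectory contracts into a bounded neighborhood determined by the input magnitude.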
1 code implementation • NeurIPS 2021 • Hrayr Harutyunyan, Maxim Raginsky, Greg Ver Steeg, Aram Galstyan
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm.
no code implementations • 29 Dec 2020 • Aolin Xu, Maxim Raginsky
We analyze the best achievable performance of Bayesian learning under generative models by defining and upper-bounding the minimum excess risk (MER): the gap between the minimum expected loss attainable by learning from data and the minimum expected loss that could be achieved if the model realization were known.
no code implementations • 18 Nov 2020 • Joshua Hanson, Maxim Raginsky, Eduardo Sontag
We consider the following learning problem: Given sample pairs of input and output signals generated by an unknown nonlinear system (which is not assumed to be causal or time-invariant), we wish to find a continuous-time recurrent neural net with hyperbolic tangent activation function that approximately reproduces the underlying i/o behavior with high confidence.
no code implementations • L4DC 2020 • Joshua Hanson, Maxim Raginsky
It is well-known that continuous-time recurrent neural nets are universal approximators for continuous-time dynamical systems.
no code implementations • 24 Mar 2020 • Naci Saldi, Tamer Basar, Maxim Raginsky
In this paper, we consider discrete-time partially observed mean-field games with the risk-sensitive optimality criterion.
no code implementations • 5 Feb 2020 • Belinda Tzen, Maxim Raginsky
We first consider the mean-field limit, where the finite population of neurons in the hidden layer is replaced by a continuum, and show that our problem can be phrased as global minimization of a free-energy functional on the space of probability measures over the weights.
1 code implementation • 12 Nov 2019 • Alan Yang, AmirEmad Ghassami, Maxim Raginsky, Negar Kiyavash, Elyse Rosenbaum
In the second step, CI testing is performed by applying the $k$-NN conditional mutual information estimator to the learned feature maps.
no code implementations • NeurIPS 2019 • Joshua Hanson, Maxim Raginsky
There has been a recent shift in sequence-to-sequence modeling from recurrent network architectures to convolutional network architectures, owing to the latter's computational advantages in training and operation while still achieving competitive performance.
no code implementations • 23 May 2019 • Belinda Tzen, Maxim Raginsky
In deep latent Gaussian models, the latent variable is generated by a time-inhomogeneous Markov chain, where at each time step we pass the current state through a parametric nonlinear map, such as a feedforward neural net, and add a small independent Gaussian perturbation.
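The generative chain just described is easy to write down concretely. In the hypothetical sketch below, a fixed random one-layer tanh net stands in for the trained feedforward map at each step; all sizes and constants are ours, for illustration.

```python
import math, random

# Time-inhomogeneous Markov chain for a deep latent Gaussian model:
# at each of K steps, pass the current state through a nonlinear map
# and add a small independent Gaussian perturbation.

random.seed(1)
D, K, sigma = 4, 8, 0.05

def layer_map(z, W, b):
    # one-layer tanh net; W, b play the role of learned parameters
    return [math.tanh(sum(W[i][j] * z[j] for j in range(D)) + b[i])
            for i in range(D)]

# one (W, b) pair per step -- the chain is time-inhomogeneous
params = [([[random.gauss(0, 0.5) for _ in range(D)] for _ in range(D)],
           [random.gauss(0, 0.1) for _ in range(D)])
          for _ in range(K)]

z = [random.gauss(0, 1) for _ in range(D)]       # z_0 ~ N(0, I)
for W, b in params:
    mean = layer_map(z, W, b)
    z = [m + sigma * random.gauss(0, 1) for m in mean]   # Gaussian noise
```

Shrinking the per-step noise while growing the number of steps is exactly the regime in which such chains approximate a diffusion process.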
no code implementations • 5 Mar 2019 • Belinda Tzen, Maxim Raginsky
We introduce and study a class of probabilistic generative models, where the latent object is a finite-dimensional diffusion process on a finite time interval and the observed variable is drawn conditionally on the terminal point of the diffusion.
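A schematic version of such a model can be simulated with the Euler–Maruyama scheme: run a diffusion on [0, 1] and draw the observation conditionally on the terminal point. The drift and the observation rule below are stand-ins of our own choosing, not the paper's.

```python
import random

# Latent diffusion  dZ_t = b(Z_t) dt + dW_t  on [0, 1], simulated by
# Euler--Maruyama; the observed variable depends only on the terminal
# point Z_1 (here, for illustration, its sign).

random.seed(2)

def sample_observation(n_steps=100):
    dt = 1.0 / n_steps
    z = 0.0
    for _ in range(n_steps):
        drift = -z                        # illustrative mean-reverting drift
        z += drift * dt + random.gauss(0, 1) * dt ** 0.5
    return 1 if z > 0 else 0              # observation given the terminal point

obs = [sample_observation() for _ in range(200)]
```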
no code implementations • 23 Dec 2018 • Jaeho Lee, Maxim Raginsky
This paper generalizes the Maurer--Pontil framework of finite-dimensional lossy coding schemes to the setting where a high-dimensional random vector is mapped to an element of a compact set of latent representations in a lower-dimensional Euclidean space, and the reconstruction map belongs to a given class of nonlinear maps.
no code implementations • 18 Feb 2018 • Belinda Tzen, Tengyuan Liang, Maxim Raginsky
For a particular local optimum of the empirical risk, with an arbitrary initialization, we show that, with high probability, at least one of the following two events will occur: (1) the Langevin trajectory ends up somewhere outside the $\varepsilon$-neighborhood of this particular optimum within a short recurrence time; (2) it enters this $\varepsilon$-neighborhood by the recurrence time and stays there until a potentially exponentially long escape time.
no code implementations • NeurIPS 2017 • Aolin Xu, Maxim Raginsky
We derive upper bounds on the generalization error of a learning algorithm in terms of the mutual information between its input and output.
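The headline bound, as we recall it, applies when the loss $\ell(w, Z)$ is $\sigma$-subgaussian for every hypothesis $w$ and the learning algorithm maps the $n$-sample $S$ to an output hypothesis $W$:

```latex
\[
  \bigl| \mathbb{E}\,\mathrm{gen}(S, W) \bigr|
  \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(S; W)},
\]
```

so the less the output hypothesis depends on the training sample, in the mutual-information sense, the smaller the expected generalization gap.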
no code implementations • NeurIPS 2018 • Jaeho Lee, Maxim Raginsky
As opposed to standard empirical risk minimization (ERM), distributionally robust optimization aims to minimize the worst-case risk over a larger ambiguity set containing the original empirical distribution of the training data.
no code implementations • 19 May 2017 • Mehmet A. Donmez, Maxim Raginsky, Andrew C. Singer
We present a generic framework for trading off fidelity and cost in computing stochastic gradients when the costs of acquiring stochastic gradients of different quality are not known a priori.
no code implementations • 13 Feb 2017 • Maxim Raginsky, Alexander Rakhlin, Matus Telgarsky
Stochastic Gradient Langevin Dynamics (SGLD) is a popular variant of Stochastic Gradient Descent, where properly scaled isotropic Gaussian noise is added to an unbiased estimate of the gradient at each iteration.
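The update rule described above fits in a few lines. The objective and the constants in this sketch are ours, for illustration only: a quadratic f(w) = w²/2, a noisy gradient estimate standing in for a minibatch gradient, and inverse temperature beta scaling the injected Gaussian noise.

```python
import math, random

# Stochastic Gradient Langevin Dynamics: a gradient step on an unbiased
# gradient estimate plus properly scaled isotropic Gaussian noise.
# Stationary distribution is approximately proportional to exp(-beta * f).

random.seed(3)

def sgld(steps=5000, eta=0.01, beta=10.0):
    w = 5.0
    for _ in range(steps):
        grad_est = w + random.gauss(0, 0.1)        # unbiased gradient estimate
        noise = math.sqrt(2.0 * eta / beta) * random.gauss(0, 1)
        w = w - eta * grad_est + noise
    return w

w = sgld()
```

Unlike plain SGD, the iterate does not converge to the minimizer but keeps fluctuating around it, sampling from a Gibbs-like distribution concentrated near low-loss regions.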
no code implementations • 31 Aug 2015 • Soomin Lee, Angelia Nedić, Maxim Raginsky
In ODA-C, to mitigate the disagreements on the primal-vector updates, the agents implement, over a static undirected graph, a generalization of the local information-exchange dynamics recently proposed by Li and Marden.
no code implementations • 14 Jan 2014 • Peng Guan, Maxim Raginsky, Rebecca Willett
This paper considers an online (real-time) control problem that involves an agent performing a discrete-time random walk over a finite state space.
no code implementations • 28 Oct 2013 • Peng Guan, Maxim Raginsky, Rebecca Willett
Online learning algorithms are designed to perform in non-stationary environments, but generally there is no notion of a dynamic state to model constraints on current and future actions as a function of past actions.
no code implementations • 1 Jul 2013 • Maxim Raginsky, Angelia Nedić
We study a model of collective real-time decision-making (or learning) in a social network operating in an uncertain environment, for which no a priori probabilistic model is available.
no code implementations • 19 Dec 2012 • Maxim Raginsky, Igal Sason
This monograph focuses on some of the key modern mathematical tools that are used for the derivation of concentration inequalities, on their links to information theory, and on their various applications to communications and coding.
no code implementations • NeurIPS 2011 • Maxim Raginsky, Alexander Rakhlin
For passive learning, our lower bounds match the upper bounds of Gine and Koltchinskii up to constants and generalize analogous results of Massart and Nedelec.
no code implementations • NeurIPS 2009 • Maxim Raginsky, Svetlana Lazebnik
This paper addresses the problem of designing binary codes for high-dimensional data such that vectors that are similar in the original space map to similar binary strings.
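The design goal can be seen in miniature with the classic random-hyperplane (SimHash) baseline — not the paper's scheme, but it illustrates the target property: vectors at a small angle receive binary codes at a small Hamming distance.

```python
import random

# Random-hyperplane hashing: each bit is the sign of a projection onto a
# random Gaussian direction, so Hamming distance between codes tracks the
# angle between the original vectors.

random.seed(4)

def make_hasher(dim, n_bits):
    planes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
    def code(x):
        return [1 if sum(p[i] * x[i] for i in range(dim)) > 0 else 0
                for p in planes]
    return code

def hamming(a, b):
    return sum(ai != bi for ai, bi in zip(a, b))

code = make_hasher(dim=16, n_bits=64)
x = [random.gauss(0, 1) for _ in range(16)]
near = [xi + 0.01 * random.gauss(0, 1) for xi in x]   # a small perturbation
far = [-xi for xi in x]                               # the antipodal point
```

The perturbed copy shares almost all code bits with the original, while the antipodal point flips every bit.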
no code implementations • NeurIPS 2008 • Maxim Raginsky, Svetlana Lazebnik, Rebecca Willett, Jorge Silva
This paper describes a recursive estimation procedure for multivariate binary densities using orthogonal expansions.