Search Results for author: Sridhar Mahadevan

Found 32 papers, 4 papers with code

GAIA: Categorical Foundations of Generative AI

no code implementations 28 Feb 2024 Sridhar Mahadevan

In this paper, we propose GAIA, a generative AI architecture based on category theory.

Zero-th Order Algorithm for Softmax Attention Optimization

no code implementations 17 Jul 2023 Yichuan Deng, Zhihang Li, Sridhar Mahadevan, Zhao Song

We demonstrate the convergence of our algorithm, highlighting its effectiveness in efficiently computing gradients for large-scale LLMs.
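
The excerpt does not spell out the estimator, but zeroth-order methods generally replace the true gradient with an estimate built from loss evaluations alone. A minimal sketch of such a randomized finite-difference estimator around an illustrative softmax-attention loss (the loss, shapes, and parameters below are assumptions, not the paper's algorithm):

```python
import numpy as np

def zeroth_order_grad(loss, w, num_samples=20, mu=1e-3):
    """Estimate the gradient of `loss` at `w` from function values only,
    averaging randomized finite differences (a generic zeroth-order scheme)."""
    g = np.zeros_like(w)
    for _ in range(num_samples):
        u = np.random.randn(*w.shape)               # random probe direction
        g += (loss(w + mu * u) - loss(w)) / mu * u  # directional derivative estimate
    return g / num_samples

def softmax_attention_loss(w, Q, K, V, target):
    """Toy objective: squared error of single-head softmax attention whose
    key projection is scaled coordinate-wise by the parameter vector w."""
    scores = (Q * w) @ K.T / np.sqrt(K.shape[1])
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return float(np.sum((attn @ V - target) ** 2))

# usage: g = zeroth_order_grad(lambda w_: softmax_attention_loss(w_, Q, K, V, target), w)
```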

Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension

no code implementations 10 Apr 2023 Yichuan Deng, Sridhar Mahadevan, Zhao Song

It runs in $\widetilde{O}(\mathrm{nnz}(X) + n^{\omega})$ time, succeeds with probability $1-\delta$, and chooses $m = O(n \log(n/\delta))$.

Sentence

An Over-parameterized Exponential Regression

no code implementations 29 Mar 2023 Yeqi Gao, Sridhar Mahadevan, Zhao Song

Mathematically, we define the neural function $F: \mathbb{R}^{d \times m} \times \mathbb{R}^d \rightarrow \mathbb{R}$ using an exponential activation function.
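
Read literally, $F$ maps a weight matrix and an input vector to a scalar. A minimal sketch of one plausible form, a sum of exponentially activated inner products (the outer weights and the exact architecture are assumptions, not the paper's definition):

```python
import numpy as np

def F(W, x, a=None):
    """F : R^{d x m} x R^d -> R with an exponential activation.
    Column r of W is the weight vector of hidden unit r; `a` are
    (assumed) outer combination weights."""
    d, m = W.shape
    if a is None:
        a = np.ones(m) / m                 # uniform outer weights for the sketch
    return float(a @ np.exp(W.T @ x))      # sum_r a_r * exp(<w_r, x>)
```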

regression

A Layered Architecture for Universal Causality

no code implementations 18 Dec 2022 Sridhar Mahadevan

At the second layer, causal models are defined by a graph-type category.

Causal Inference LEMMA

Privacy Aware Experiments without Cookies

no code implementations 3 Nov 2022 Shiv Shankar, Ritwik Sinha, Saayan Mitra, Viswanathan Swaminathan, Sridhar Mahadevan, Moumita Sinha

We propose a two-stage experimental design, where the two brands only need to agree on high-level aggregate parameters of the experiment to test the alternate experiences.

Experimental Design valid

Unifying Causal Inference and Reinforcement Learning using Higher-Order Category Theory

no code implementations 13 Sep 2022 Sridhar Mahadevan

We present a unified formalism for structure discovery of causal models and predictive state representation (PSR) models in reinforcement learning (RL) using higher-order category theory.

Causal Inference reinforcement-learning +1

Categoroids: Universal Conditional Independence

no code implementations 23 Aug 2022 Sridhar Mahadevan

Categoroids are defined as a hybrid of two categories: the first encodes a preordered lattice structure defined by objects and the arrows between them; the second, a dual parameterization, involves trigonoidal objects and morphisms defining a conditional independence structure, with bridge morphisms providing the interface between the binary and ternary structures.

Causal Inference

On The Universality of Diagrams for Causal Inference and The Causal Reproducing Property

no code implementations 6 Jul 2022 Sridhar Mahadevan

The second result, the Causal Reproducing Property (CRP), states that any causal influence of an object X on another object Y is representable as a natural transformation between two abstract causal diagrams.

Causal Inference LEMMA

Smoothed Online Combinatorial Optimization Using Imperfect Predictions

no code implementations 23 Apr 2022 Kai Wang, Zhao Song, Georgios Theocharous, Sridhar Mahadevan

Smoothed online combinatorial optimization considers a learner who repeatedly chooses a combinatorial decision to minimize an unknown changing cost function with a penalty on switching decisions in consecutive rounds.
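
Concretely, the learner pays a per-round hitting cost plus a penalty for changing its decision between rounds. A small sketch of that objective with a Hamming switching cost (the cost functions and penalty below are placeholders, not the paper's method):

```python
def smoothed_total_cost(decisions, cost_fns, switch_penalty=1.0):
    """Cumulative cost of a sequence of combinatorial (e.g. binary-vector)
    decisions: per-round cost plus a penalty on switching between rounds."""
    total = 0.0
    for t, (x_t, c_t) in enumerate(zip(decisions, cost_fns)):
        total += c_t(x_t)                                    # hitting cost
        if t > 0:                                            # switching cost
            total += switch_penalty * sum(a != b for a, b in zip(decisions[t - 1], x_t))
    return total
```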

Combinatorial Optimization

Universal Decision Models

no code implementations 28 Oct 2021 Sridhar Mahadevan

Decision objects in a UDM correspond to instances of decision tasks, ranging from causal models and dynamical systems such as Markov decision processes and predictive state representations, to network multiplayer games and Witsenhausen's intrinsic model, which generalizes all these previous formalisms.

Causal Inference

Asymptotic Causal Inference

no code implementations 20 Sep 2021 Sridhar Mahadevan

Semantic entropy quantifies the reduction in entropy when edges are removed by causal intervention.
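
For intuition only (the numbers below are made up and this is not the paper's definition), the reduction can be read as the difference in Shannon entropy of a target variable before and after an intervention removes an incoming edge:

```python
import numpy as np

def shannon_entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

p_observational = [0.25, 0.25, 0.25, 0.25]   # Y with the edge X -> Y present
p_interventional = [0.70, 0.10, 0.10, 0.10]  # Y after do(X = x) cuts the edge
entropy_reduction = shannon_entropy(p_observational) - shannon_entropy(p_interventional)
```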

Causal Inference Experimental Design

Causal Homotopy

no code implementations 20 Sep 2021 Sridhar Mahadevan

Second, a diverse range of graphical models used to represent causal structures can be represented in a unified way in terms of a topological representation of the induced poset structure.

Causal Discovery

Causal Inference in Network Economics

no code implementations 20 Sep 2021 Sridhar Mahadevan

Network economics is the study of a rich class of equilibrium problems that occur in the real world, from traffic management to supply chains and two-sided online marketplaces.

Causal Inference Management

Multiscale Manifold Warping

no code implementations 19 Sep 2021 Sridhar Mahadevan, Anup Rao, Georgios Theocharous, Jennifer Healey

Many real-world applications require aligning two temporal sequences, including bioinformatics, handwriting recognition, activity recognition, and human-robot coordination.
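
The classical baseline for this task is dynamic time warping, which finds a minimum-cost monotone alignment between two sequences; a standard sketch is below (the paper's multiscale, manifold-based alignment is not reproduced here):

```python
import numpy as np

def dtw_cost(x, y, dist=lambda a, b: abs(a - b)):
    """Classic O(len(x) * len(y)) dynamic time warping alignment cost."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = dist(x[i - 1], y[j - 1]) + min(
                D[i - 1, j],      # x[i-1] aligned to a gap
                D[i, j - 1],      # y[j-1] aligned to a gap
                D[i - 1, j - 1],  # x[i-1] aligned to y[j-1]
            )
    return D[n, m]
```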

Activity Recognition Dynamic Time Warping +2

Finite-Sample Analysis of Proximal Gradient TD Algorithms

no code implementations 6 Jun 2020 Bo Liu, Ji Liu, Mohammad Ghavamzadeh, Sridhar Mahadevan, Marek Petrik

In this paper, we analyze the convergence rate of the gradient temporal difference learning (GTD) family of algorithms.
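
For reference, one member of that family is GTD2, whose two-timescale updates under linear function approximation take the standard form sketched below (the step sizes and features are illustrative, and this is not the paper's analysis):

```python
import numpy as np

def gtd2_step(theta, w, phi, phi_next, reward, gamma=0.99, alpha=0.01, beta=0.05):
    """One GTD2 update for a linear value function V(s) ~ theta . phi(s)."""
    delta = reward + gamma * (theta @ phi_next) - theta @ phi    # TD error
    theta = theta + alpha * (phi - gamma * phi_next) * (phi @ w) # primal update
    w = w + beta * (delta - phi @ w) * phi                       # auxiliary weights
    return theta, w
```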

Regularized Off-Policy TD-Learning

no code implementations NeurIPS 2012 Bo Liu, Sridhar Mahadevan, Ji Liu

We present a novel $l_1$ regularized off-policy convergent TD-learning method (termed RO-TD), which is able to learn sparse representations of value functions with low computational complexity.

feature selection

Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity

1 code implementation 6 Jun 2020 Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, Marek Petrik

In this paper, we introduce proximal gradient temporal difference learning, which provides a principled way of designing and analyzing true stochastic gradient temporal difference learning algorithms.

reinforcement-learning Reinforcement Learning (RL)

Optimizing for the Future in Non-Stationary MDPs

1 code implementation ICML 2020 Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas

Most reinforcement learning methods are based upon the key assumption that the transition dynamics and reward functions are fixed, that is, the underlying Markov decision process is stationary.

Global Convergence to the Equilibrium of GANs using Variational Inequalities

no code implementations 4 Aug 2018 Ian Gemp, Sridhar Mahadevan

In optimization, the negative gradient of a function denotes the direction of steepest descent.
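
That observation is the whole mechanism behind plain gradient descent; a tiny illustration on a quadratic (for intuition only, separate from the paper's variational-inequality analysis):

```python
import numpy as np

x = np.array([3.0, -2.0])
for _ in range(100):
    grad = 2.0 * x          # gradient of f(x) = ||x||^2
    x = x - 0.1 * grad      # step along the negative gradient (steepest descent)
# x is now close to the minimizer at the origin
```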

Generative Adversarial Network

A Unified Framework for Domain Adaptation using Metric Learning on Manifolds

1 code implementation 28 Apr 2018 Sridhar Mahadevan, Bamdev Mishra, Shalini Ghosh

We present a novel framework for domain adaptation, whereby both geometric and statistical differences between a labeled source domain and unlabeled target domain can be integrated by exploiting the curved Riemannian geometry of statistical manifolds.

Domain Adaptation Metric Learning +1

Online Monotone Games

no code implementations 19 Oct 2017 Ian Gemp, Sridhar Mahadevan

Algorithmic game theory (AGT) focuses on the design and analysis of algorithms for interacting agents, with interactions rigorously formalized within the framework of games.

Reinforcement Learning (RL)

A Manifold Approach to Learning Mutually Orthogonal Subspaces

no code implementations 8 Mar 2017 Stephen Giguere, Francisco Garcia, Sridhar Mahadevan

Although many machine learning algorithms involve learning subspaces with particular characteristics, optimizing a parameter matrix that is constrained to represent a subspace can be challenging.
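
A common generic device for such constraints is to keep the parameter matrix on the Stiefel manifold (orthonormal columns) by re-orthonormalizing after each update, e.g. with a QR retraction; the sketch below shows only that generic step, not the paper's algorithm:

```python
import numpy as np

def qr_retraction(W):
    """Return an orthonormal basis spanning (approximately) the same
    subspace as the columns of W, so that W^T W = I after each update."""
    Q, R = np.linalg.qr(W)
    return Q * np.sign(np.diag(R))            # fix column signs for determinism

W = np.linalg.qr(np.random.randn(10, 3))[0]   # start on the constraint set
W = W - 0.1 * np.random.randn(10, 3)          # (placeholder) gradient step
W = qr_retraction(W)                          # map back onto the constraint set
```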

Domain Adaptation Riemannian optimization

Generative Multi-Adversarial Networks

1 code implementation 5 Nov 2016 Ishan Durugkar, Ian Gemp, Sridhar Mahadevan

Generative adversarial networks (GANs) are a framework for producing a generative model by way of a two-player minimax game.
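
The two-player game referred to is the standard GAN objective, in which a discriminator $D$ and a generator $G$ solve

$\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))],$

and the multi-adversarial extension studied here replaces the single discriminator $D$ with several.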

Ranked #67 on Image Generation on CIFAR-10 (Inception score metric)

Image Generation

Online Monotone Optimization

no code implementations 29 Aug 2016 Ian Gemp, Sridhar Mahadevan

This paper presents a new framework for analyzing and designing no-regret algorithms for dynamic (possibly adversarial) systems.

Inverting Variational Autoencoders for Improved Generative Accuracy

no code implementations 21 Aug 2016 Ian Gemp, Ishan Durugkar, Mario Parente, M. Darby Dyar, Sridhar Mahadevan

Recent advances in semi-supervised learning with deep generative models have shown promise in generalizing from small labeled datasets ($\mathbf{x},\mathbf{y}$) to large unlabeled ones ($\mathbf{x}$).

Deep Reinforcement Learning With Macro-Actions

no code implementations 15 Jun 2016 Ishan P. Durugkar, Clemens Rosenbaum, Stefan Dernbach, Sridhar Mahadevan

Deep reinforcement learning has been shown to be a powerful framework for learning policies from complex high-dimensional sensory inputs to actions in complex tasks, such as the Atari domain.

Atari Games reinforcement-learning +1

Reasoning about Linguistic Regularities in Word Embeddings using Matrix Manifolds

no code implementations 28 Jul 2015 Sridhar Mahadevan, Sarath Chandar

In this paper, we introduce a new approach to capture analogies in continuous word representations, based on modeling not just individual word vectors, but rather the subspaces spanned by groups of words.
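
A minimal sketch of that idea, assuming pre-trained embeddings and using principal angles to compare the subspaces spanned by two groups of words (an illustration, not the paper's exact construction):

```python
import numpy as np

def subspace_basis(word_vectors):
    """Orthonormal basis for the span of a group of word vectors (rows)."""
    U, _, _ = np.linalg.svd(np.asarray(word_vectors).T, full_matrices=False)
    return U

def principal_angles(A, B):
    """Principal angles between subspaces given orthonormal bases A and B."""
    cosines = np.linalg.svd(A.T @ B, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# e.g. compare span{v(king), v(queen)} against span{v(man), v(woman)}:
# angles = principal_angles(subspace_basis([v_king, v_queen]),
#                           subspace_basis([v_man, v_woman]))
```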

Word Embeddings

Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces

no code implementations 26 May 2014 Sridhar Mahadevan, Bo Liu, Philip Thomas, Will Dabney, Steve Giguere, Nicholas Jacek, Ian Gemp, Ji Liu

In this paper, we set forth a new vision of reinforcement learning developed by us over the past few years, one that yields mathematically rigorous solutions to longstanding important questions that have remained unresolved: (i) how to design reliable, convergent, and robust reinforcement learning algorithms; (ii) how to guarantee that reinforcement learning satisfies pre-specified "safety" guarantees and remains in a stable region of the parameter space; (iii) how to design "off-policy" temporal difference learning algorithms in a reliable and stable manner; and finally (iv) how to integrate the study of reinforcement learning into the rich theory of stochastic optimization.

Decision Making reinforcement-learning +2

Projected Natural Actor-Critic

no code implementations NeurIPS 2013 Philip S. Thomas, William C. Dabney, Stephen Giguere, Sridhar Mahadevan

Natural actor-critics are a popular class of policy search algorithms for finding locally optimal policies for Markov decision processes.

reinforcement-learning Reinforcement Learning (RL)

Basis Construction from Power Series Expansions of Value Functions

no code implementations NeurIPS 2010 Sridhar Mahadevan, Bo Liu

This paper explores links between basis construction methods in Markov decision processes and power series expansions of value functions.
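
The link in question is the Neumann (power) series of the value function: for a policy with transition matrix $P$, reward vector $R$, and discount $\gamma$,

$V = (I - \gamma P)^{-1} R = \sum_{k=0}^{\infty} \gamma^k P^k R,$

so truncating the series suggests building basis vectors from $R, PR, P^2R, \dots$ (a Krylov-type basis). The notation here is the standard MDP one and only illustrates the type of expansion involved.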

Unity
