Stochastic Optimization
282 papers with code • 12 benchmarks • 11 datasets
Stochastic Optimization is the task of optimizing an objective function by generating and using random variables. It is usually an iterative process in which random samples are drawn and used to progressively locate the minima or maxima of the objective. Stochastic optimization is typically applied to non-convex problems where deterministic methods such as linear or quadratic programming and their variants cannot be used.
Source: ASOC: An Adaptive Parameter-free Stochastic Optimization Techinique for Continuous Variables
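To make the iterative, sampling-driven process described above concrete, here is a minimal sketch of simulated annealing minimizing a one-dimensional non-convex function. The objective, step size, and cooling schedule are arbitrary choices for illustration and are not drawn from any of the papers listed below.

```python
import numpy as np

def objective(x):
    # A simple non-convex test function with many local minima.
    return np.sin(3 * x) + 0.1 * x ** 2

def simulated_annealing(f, x0, n_iters=5000, step=0.5, t0=1.0, seed=0):
    """Minimize f by proposing random perturbations and accepting
    worse moves with a probability that shrinks as the temperature cools."""
    rng = np.random.default_rng(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for i in range(n_iters):
        temperature = t0 / (1 + i)           # simple cooling schedule
        candidate = x + step * rng.normal()  # stochastic proposal
        f_cand = f(candidate)
        # Always accept improvements; accept worse points with Metropolis probability.
        if f_cand < fx or rng.random() < np.exp(-(f_cand - fx) / temperature):
            x, fx = candidate, f_cand
            if fx < best_fx:
                best_x, best_fx = x, fx
    return best_x, best_fx

print(simulated_annealing(objective, x0=4.0))
```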
Libraries
Use these libraries to find Stochastic Optimization models and implementations
Latest papers
Why Do We Need Weight Decay in Modern Deep Learning?
In this work, we highlight that the role of weight decay in modern deep learning is different from its regularization effect studied in classical learning theory.
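For readers unfamiliar with the distinction the paper studies, the sketch below contrasts coupling the weight-decay penalty into the gradient (classical L2 regularization) with applying it as a separate, decoupled shrinkage step in the style of AdamW. It is a generic illustration under the standard Adam update equations, not the paper's analysis or experimental setup.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8,
              lam=1e-2, decoupled=True):
    """One Adam update with weight decay; t is the 1-indexed step count.
    Coupled decay (decoupled=False) adds lam*w to the gradient, so the
    penalty flows through the adaptive moments; decoupled decay
    (AdamW-style) shrinks the weights separately from the loss gradient."""
    if not decoupled:
        grad = grad + lam * w              # classical L2: penalty enters the moments
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    if decoupled:
        w = w - lr * lam * w               # decay bypasses the adaptive scaling
    return w, m, v
```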
Smoothing Methods for Automatic Differentiation Across Conditional Branches
We detail the effects of the approximations made for tractability in smooth interpretation (SI) and propose a novel Monte Carlo estimator that avoids the underlying assumptions by estimating the smoothed programs' gradients through a combination of AD and sampling.
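As a generic illustration of why branches need smoothing and how sampling can recover a gradient signal, the snippet below smooths a hard branch with a sigmoid blend and estimates the gradient of the expected branch output with a score-function Monte Carlo estimator. It is a simplified stand-in, not the estimator proposed in the paper; the smoothing temperature and noise scale are arbitrary.

```python
import numpy as np

def branchy(x):
    # A discontinuous program: the derivative w.r.t. x is zero almost
    # everywhere except at the jump, so plain AD gives no useful signal.
    return 1.0 if x > 0 else 0.0

def smoothed(x, tau=0.1):
    # Replace the hard branch by a sigmoid-weighted blend of both outcomes.
    p = 1.0 / (1.0 + np.exp(-x / tau))
    return p * 1.0 + (1 - p) * 0.0

def mc_smoothed_grad(x, sigma=0.5, n=10_000, seed=0):
    # Monte Carlo gradient of E[branchy(x + noise)] via the score-function
    # identity: d/dx E = E[branchy(x + eps) * eps / sigma^2] for Gaussian noise.
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n)
    vals = np.array([branchy(x + e) for e in eps])
    return np.mean(vals * eps / sigma ** 2)
```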
Quasi-Monte Carlo for 3D Sliced Wasserstein
Monte Carlo (MC) integration has been employed as the standard approximation method for the Sliced Wasserstein (SW) distance, whose analytical expression involves an intractable expectation.
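For context, a plain Monte Carlo approximation of the Sliced Wasserstein distance between two equally sized 3D point clouds looks roughly like the sketch below, with the expectation over projection directions replaced by an average over random unit vectors. Quasi-Monte Carlo, as studied in the paper, would substitute a low-discrepancy point set on the sphere for the random directions; function names and defaults here are illustrative.

```python
import numpy as np

def sliced_wasserstein_mc(x, y, n_projections=100, p=2, seed=0):
    """Monte Carlo estimate of the Sliced Wasserstein-p distance between
    two point clouds x, y of shape (n, 3) with the same number of points."""
    rng = np.random.default_rng(seed)
    # Uniform random directions on the unit sphere (QMC would use a
    # low-discrepancy point set here instead).
    dirs = rng.normal(size=(n_projections, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    total = 0.0
    for d in dirs:
        # Project to 1D; the 1D Wasserstein distance reduces to a
        # sorted-sample difference for uniform empirical measures.
        px, py = np.sort(x @ d), np.sort(y @ d)
        total += np.mean(np.abs(px - py) ** p)
    return (total / n_projections) ** (1 / p)
```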
Landscape-Sketch-Step: An AI/ML-Based Metaheuristic for Surrogate Optimization Problems
In this paper, we introduce a new heuristic for global optimization in scenarios where extensive evaluations of the cost function are expensive, inaccessible, or even prohibitive.
A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
It constructs a block-diagonal preconditioner where each block consists of a coarse Kronecker product approximation to full-matrix AdaGrad for each parameter of the neural network.
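The block below is a simplified, single-parameter sketch of the Kronecker-factored preconditioning idea described in the snippet: accumulate L = sum(G G^T) and R = sum(G^T G) and precondition each gradient as L^{-1/4} G R^{-1/4}. It omits the blocking, grafting, distributed data-parallel logic, and numerical safeguards of the actual Distributed Shampoo implementation, and the class and function names are my own.

```python
import numpy as np

def matrix_inverse_root(mat, exponent=4, eps=1e-6):
    # Compute mat^(-1/exponent) via an eigendecomposition (mat is symmetric PSD).
    vals, vecs = np.linalg.eigh(mat)
    vals = np.maximum(vals, 0.0) + eps
    return (vecs * vals ** (-1.0 / exponent)) @ vecs.T

class ShampooLikeOptimizer:
    """Kronecker-factored preconditioning for one matrix parameter W:
    maintain L = sum G G^T and R = sum G^T G, and precondition each
    gradient as L^{-1/4} G R^{-1/4}, a coarse approximation to
    full-matrix AdaGrad."""
    def __init__(self, shape, lr=0.1):
        m, n = shape
        self.lr = lr
        self.L = np.zeros((m, m))
        self.R = np.zeros((n, n))

    def step(self, W, grad):
        self.L += grad @ grad.T
        self.R += grad.T @ grad
        precond = matrix_inverse_root(self.L) @ grad @ matrix_inverse_root(self.R)
        return W - self.lr * precond
```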
PROMISE: Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates
This paper introduces PROMISE ($\textbf{Pr}$econditioned Stochastic $\textbf{O}$ptimization $\textbf{M}$ethods by $\textbf{I}$ncorporating $\textbf{S}$calable Curvature $\textbf{E}$stimates), a suite of sketching-based preconditioned stochastic gradient algorithms for solving large-scale convex optimization problems arising in machine learning.
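As a rough illustration of what a preconditioned stochastic gradient method with a scalable curvature estimate looks like in the simplest case, the sketch below builds a preconditioner for ridge regression from a row subsample of the data matrix and applies it inside SGD. The subsampling is only a stand-in for the randomized sketches used by PROMISE; the paper's algorithms are more sophisticated, and all names and defaults here are illustrative.

```python
import numpy as np

def sketched_preconditioner(A, lam, sketch_size=200, seed=0):
    """Cheap curvature estimate for ridge regression built from a row
    subsample of A (a stand-in for more refined randomized sketches)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    idx = rng.choice(n, size=min(sketch_size, n), replace=False)
    H_hat = A[idx].T @ A[idx] / len(idx) + lam * np.eye(d)
    return np.linalg.inv(H_hat)

def preconditioned_sgd(A, b, lam=1e-2, lr=0.5, epochs=10, batch=32, seed=0):
    """SGD on the ridge objective ||Aw - b||^2 / (2n) + lam ||w||^2 / 2,
    with each stochastic gradient multiplied by the fixed preconditioner."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    P = sketched_preconditioner(A, lam, seed=seed)
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch):
            i = perm[start:start + batch]
            grad = A[i].T @ (A[i] @ w - b[i]) / len(i) + lam * w
            w -= lr * P @ grad  # curvature-aware step
    return w
```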
Likelihood-based inference and forecasting for trawl processes: a stochastic optimization approach
In this paper, we develop the first likelihood-based methodology for the inference of real-valued trawl processes and introduce novel deterministic and probabilistic forecasting methods.
Integrating LLMs and Decision Transformers for Language Grounded Generative Quality-Diversity
Quality-Diversity is a branch of stochastic optimization that is often applied to problems from the Reinforcement Learning and control domains in order to construct repertoires of well-performing policies/skills that exhibit diversity with respect to a behavior space.
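As background on Quality-Diversity, the sketch below implements MAP-Elites, a canonical QD algorithm that keeps one elite per cell of a discretized behavior space and so produces a repertoire of diverse, well-performing solutions. The toy fitness function, behavior descriptor, and mutation scheme are placeholders for illustration and are unrelated to the LLM and Decision Transformer components of the paper.

```python
import numpy as np

def fitness(x):
    # Toy objective: prefer solutions close to the origin.
    return -np.sum(x ** 2)

def behavior(x):
    # Toy behavior descriptor: the first two coordinates, mapped into [0, 1]^2.
    return np.clip((x[:2] + 2) / 4, 0, 1)

def map_elites(dim=5, n_bins=10, iters=10_000, sigma=0.2, seed=0):
    """Maintain an archive with at most one elite per behavior-space cell."""
    rng = np.random.default_rng(seed)
    archive = {}  # cell index -> (solution, fitness)
    for _ in range(iters):
        if archive and rng.random() < 0.9:
            key = list(archive)[rng.integers(len(archive))]
            x = archive[key][0] + sigma * rng.normal(size=dim)  # mutate a stored elite
        else:
            x = rng.uniform(-2, 2, size=dim)                    # random exploration
        cell = tuple((behavior(x) * n_bins).astype(int).clip(0, n_bins - 1))
        f = fitness(x)
        if cell not in archive or f > archive[cell][1]:
            archive[cell] = (x, f)                              # keep the best per cell
    return archive
```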
Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining
To address the above challenge, we study the serverless multi-party collaborative AUPRC maximization problem, since serverless multi-party collaborative training can cut down communication costs by avoiding the server-node bottleneck. We reformulate it as a conditional stochastic optimization problem in a serverless multi-party collaborative learning setting and propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize the AUPRC.
A stochastic optimization approach to train non-linear neural networks with a higher-order variation regularization
While the $(k, q)$-VR terms applied to general parametric models are computationally intractable due to the integration, this study provides a stochastic optimization algorithm that can efficiently train general models with the $(k, q)$-VR without conducting explicit numerical integration.
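The $(k, q)$-VR penalty itself is not spelled out in the snippet above, so the sketch below only illustrates the underlying trick of replacing an intractable integral penalty with a per-step Monte Carlo estimate: here, a first-order variation penalty over a 1D input approximated with sampled finite differences. It is a generic stand-in, not the algorithm from the paper.

```python
import numpy as np

def mc_variation_penalty(f, rng, n_samples=64, h=1e-3, q=2):
    """Monte Carlo surrogate for an integral penalty of the form
    int |f'(x)|^q dx over [-1, 1]: sample inputs uniformly and use a
    central finite difference in place of the exact derivative, so the
    penalty can be estimated inside each stochastic gradient step
    instead of via explicit numerical integration."""
    x = rng.uniform(-1, 1, size=n_samples)
    df = (f(x + h) - f(x - h)) / (2 * h)
    return np.mean(np.abs(df) ** q)

rng = np.random.default_rng(0)
print(mc_variation_penalty(np.tanh, rng))  # example: penalize the variation of tanh
```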