no code implementations • 4 May 2021 • Sajad Khodadadian, Prakirt Raj Jhunjhunwala, Sushil Mahavir Varma, Siva Theja Maguluri
We further improve this convergence result by introducing a variant of Natural Policy Gradient with adaptive step sizes.
no code implementations • 9 Aug 2020 • Sushil Mahavir Varma, Francisco Castro, Siva Theja Maguluri
We then study the system under a large market regime in which the arrival rates are scaled by $\eta$, and present a probabilistic two-price policy and a max-weight matching policy which result in a net profit loss of at most $O(\eta^{1/3})$.
Optimization and Control • Probability