no code implementations • 28 Jun 2023 • Ningyuan Chen, Wenhao Li
We consider a decision maker allocating one unit of a renewable and divisible resource in each period across a number of arms.
no code implementations • 20 Nov 2022 • Ningyuan Chen, Ming Hu, Wenhao Li
In view of such a conflict, we provide a general analytical framework to study the augmentation of algorithmic decisions with human knowledge: the analyst uses the knowledge to set a guardrail by which the algorithmic decision is clipped if the algorithmic output is out of bounds and seems unreasonable.
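The clipping step described in this abstract can be illustrated with a minimal sketch. The function name `apply_guardrail` and the parameters `lower` and `upper` are hypothetical, and the sketch assumes a scalar decision with an interval guardrail; it is not the authors' implementation.

```python
def apply_guardrail(algorithmic_decision: float, lower: float, upper: float) -> float:
    """Clip an algorithmic decision to the analyst's guardrail [lower, upper].

    Hypothetical illustration: the bounds stand in for the human knowledge
    that marks which outputs are considered reasonable.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    # Inside the guardrail the decision passes through unchanged;
    # outside it, the nearest bound is returned instead.
    return min(max(algorithmic_decision, lower), upper)


if __name__ == "__main__":
    # An algorithm suggests a value of 12.0, but the analyst allows only [5.0, 10.0].
    print(apply_guardrail(12.0, 5.0, 10.0))
    # A suggestion inside the guardrail is left as is.
    print(apply_guardrail(7.5, 5.0, 10.0))
```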
no code implementations • 11 Sep 2022 • Ningyuan Chen, Setareh Farajollahzadeh, Guan Wang
In this paper, we propose an approach to learn the distribution of consumers' valuations toward the products using bundle sales data.
no code implementations • 5 Jan 2022 • Ningyuan Chen, Shuoguang Yang, Hailun Zhang
In the multi-armed bandit framework, there are two formulations that are commonly employed to handle time-varying reward distributions: adversarial bandit and nonstationary bandit.
no code implementations • 31 Jul 2021 • Ningyuan Chen, Xuefeng Gao, Yi Xiong
It has been recently shown in the literature that the sample averages from online learning experiments are biased when used to estimate the mean reward.
no code implementations • 8 Jul 2021 • Yi Xiong, Ningyuan Chen, Xuefeng Gao, Xiang Zhou
We study model-based undiscounted reinforcement learning for partially observable Markov decision processes (POMDPs).
no code implementations • 10 Jun 2021 • Tianyu Wang, Ningyuan Chen, Chun Wang
In prescriptive analytics, the decision-maker observes historical samples of $(X, Y)$, where $Y$ is the uncertain problem parameter and $X$ is the concurrent covariate, without knowing the joint distribution.
no code implementations • NeurIPS 2021 • Ningyuan Chen
We consider the continuum-armed bandit problem when the arm sequence is required to be monotone.
no code implementations • 7 Dec 2020 • Ningyuan Chen, Anran Li, Shuoguang Yang
When the conditional purchase probabilities are not known and may depend on consumer and product features, we devise an online learning algorithm that achieves $\tilde{\mathcal{O}}(\sqrt{T})$ regret relative to the approximation algorithm, despite the censoring of information: the attention span of a customer who purchases an item is not observable.
no code implementations • 17 Sep 2020 • Wenhao Li, Ningyuan Chen, L. Jeff Hong
Our algorithm achieves the regret $\tilde{O}(T^{(d_x^*+d_y+1)/(d_x^*+d_y+2)})$, where $d_x^*$ is the effective covariate dimension.
no code implementations • 16 May 2020 • Ningyuan Chen, Chun Wang, Longlin Wang
We show that our learning policy incurs a regret upper bound of $\tilde{O}(\sqrt{T\sum_{k=1}^K T_k})$, where $T_k$ is the period of arm $k$.
no code implementations • NeurIPS 2021 • Xiang Zhou, Yi Xiong, Ningyuan Chen, Xuefeng Gao
We study a multi-armed bandit problem where the rewards exhibit regime switching.
no code implementations • 3 Aug 2019 • Ningyuan Chen, Guillermo Gallego, Zhuodong Tang
We also prove that the random forest can recover preference rankings of customers thanks to splitting criteria such as the Gini index and the information gain ratio.
no code implementations • 15 Jul 2019 • Wenhao Li, Ningyuan Chen, L. Jeff Hong
The literature has shown that for Lipschitz-continuous functions, the optimal regret is $\tilde{O}(T^{\frac{d_x+d_y+1}{d_x+d_y+2}})$, where $d_x$ and $d_y$ are the dimensions of contexts and arms, and thus suffers from the curse of dimensionality.
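As a quick arithmetic illustration of why this rate suffers from the curse of dimensionality, write $d = d_x + d_y$ for the total dimension (the specific values of $d$ below are chosen only for illustration and do not come from the paper):

$$
\frac{d+1}{d+2} = 1 - \frac{1}{d+2}, \qquad
d = 2:\ \frac{3}{4}, \qquad
d = 10:\ \frac{11}{12} \approx 0.917, \qquad
d \to \infty:\ \frac{d+1}{d+2} \to 1 .
$$

As $d$ grows, the exponent approaches $1$, so the regret bound approaches the linear rate $T$ that any policy attains trivially when rewards are bounded.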
no code implementations • 20 Dec 2018 • Ningyuan Chen, Guillermo Gallego
We consider the problem of a firm seeking to use personalized pricing to sell an exogenously given stock of a product over a finite selling horizon to different consumer types.
no code implementations • 3 May 2018 • Ningyuan Chen, Guillermo Gallego
We propose a nonparametric pricing policy to simultaneously learn the preference of customers based on the covariates and maximize the expected revenue over a finite horizon.
no code implementations • 27 Jan 2017 • Donald K. K. Lee, Ningyuan Chen, Hemant Ishwaran
Given functional data from a survival process with time-dependent covariates, we derive a smooth convex representation for its nonparametric log-likelihood functional and obtain its functional gradient.
no code implementations • 30 Oct 2016 • Ningyuan Chen, Donald K. K. Lee, Sahand Negahban
Exploiting the fact that most arrival processes exhibit cyclic behaviour, we propose a simple procedure for estimating the intensity of a nonhomogeneous Poisson process.