no code implementations • CCL 2020 • Xu Zhuopeng, Li Xia, Li Yinlin, Wang Zihan, Fanxu Yujie, Lai Xiaoyan
Legal Judgement Prediction has attracted increasing attention in recent years.
no code implementations • CCL 2022 • He Junyi, Zhuang Junbin, Li Xia
Grammatical error correction (GEC) aims to correct texts containing various types of grammatical errors into natural and correct forms.
no code implementations • CCL 2022 • Zheng Yangjia, Li Xia, Ma Junteng, Chen Yuan
In practice, the news and price of a stock on a given day are typically influenced by different past days with different weights, and news and prices can also influence each other.
no code implementations • 27 Feb 2023 • Li Xia, Shuai Ma
In this paper, we propose a new approach to find the globally optimal policy for combined metrics of steady-state mean and variance in an infinite-horizon undiscounted MDP.
no code implementations • 10 Dec 2022 • Shuhua Xiao, Jiali Ma, Li Xia, Shushang Zhu
In this paper, we regard the issue of the optimal bailout (capital injection) as a black-box optimization problem, where the black box is characterized as a fixed-point system that follows the E-N framework for measuring the systemic risk of the financial system.
no code implementations • 17 Oct 2022 • Li Xia, Peter W. Glynn
CVaR (Conditional Value at Risk) is a risk metric widely used in finance.
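As context for the metric this paper studies, here is a minimal sketch of the empirical CVaR of a loss sample: VaR at level α is the α-quantile of the losses, and CVaR is the mean loss beyond that quantile. The function name and the interpolation choice are illustrative, not from the paper.

```python
import numpy as np

def cvar(losses, alpha=0.95):
    """Empirical CVaR: mean of the losses at or beyond the alpha-quantile (VaR)."""
    losses = np.asarray(losses, dtype=float)
    var = np.quantile(losses, alpha)   # Value at Risk at level alpha
    tail = losses[losses >= var]       # the worst (1 - alpha) fraction of outcomes
    return tail.mean()

losses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
print(cvar(losses, alpha=0.8))  # averages the tail beyond the 0.8-quantile
```

Because CVaR averages the tail rather than reading off a single quantile, it is coherent and more sensitive to extreme losses than VaR.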
no code implementations • 14 Sep 2022 • Xiaoteng Ma, Zhipeng Liang, Jose Blanchet, Mingwen Liu, Li Xia, Jiheng Zhang, Qianchuan Zhao, Zhengyuan Zhou
Among the reasons hindering reinforcement learning (RL) applications to real-world problems, two factors are critical: limited data and the mismatch between the testing environment (the real environment in which the policy is deployed) and the training environment (e.g., a simulator).
no code implementations • 15 Jun 2022 • Xiaoteng Ma, Shuai Ma, Li Xia, Qianchuan Zhao
Keeping risk under control is often more crucial than maximizing expected reward in real-world decision-making settings such as finance, robotics, and autonomous driving.
no code implementations • 15 Jan 2022 • Shuai Ma, Xiaoteng Ma, Li Xia
To deal with this unorthodox problem, we introduce a pseudo mean to transform the untreatable MDP into a standard one with a redefined reward function, and derive a discounted mean-variance performance difference formula.
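To illustrate the idea of the pseudo-mean transformation, the following sketch folds a variance penalty into a scalar reward using a fixed pseudo mean `b`; the functional form `r - lam * (r - b)**2` is a common mean-variance reward redefinition and is an assumption here, not necessarily the paper's exact formula.

```python
def transformed_reward(r, pseudo_mean, lam):
    """Hypothetical mean-variance reward redefinition.

    The quadratic penalty lam * (r - b)^2 turns the variance term into part
    of a standard scalar reward; b (the pseudo mean) is held fixed so the
    transformed problem is an ordinary MDP, and b is updated in an outer loop.
    """
    return r - lam * (r - pseudo_mean) ** 2

# When the realized reward equals the pseudo mean, no penalty is applied.
print(transformed_reward(1.0, pseudo_mean=1.0, lam=0.5))  # 1.0
```

The key point is that with `b` fixed, the redefined reward depends only on `(s, a)` as usual, so standard dynamic-programming machinery applies.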
no code implementations • 7 Jun 2021 • Xiaoteng Ma, Xiaohang Tang, Li Xia, Jun Yang, Qianchuan Zhao
Our work provides a unified trust-region framework covering both the discounted and average criteria, which may extend reinforcement learning beyond discounted objectives.
no code implementations • 9 Aug 2020 • Li Xia
This paper investigates the optimization problem of an infinite-stage, discrete-time Markov decision process (MDP) with a long-run average metric that considers both the mean and variance of rewards.
no code implementations • 25 Jun 2020 • Chenghao Li, Xiaoteng Ma, Chongjie Zhang, Jun Yang, Li Xia, Qianchuan Zhao
In these tasks, our approach learns a diverse set of options, each of which has a strongly coherent state-action space.
2 code implementations • 20 Jun 2020 • Jui-Ting Huang, Ashish Sharma, Shuying Sun, Li Xia, David Zhang, Philip Pronin, Janani Padmanabhan, Giuseppe Ottaviano, Linjun Yang
In this paper, we discuss the techniques for applying EBR to a Facebook Search system.
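The core of embedding-based retrieval (EBR) is scoring candidates by similarity between a query embedding and document embeddings; a minimal sketch of that step, with made-up two-dimensional vectors and brute-force nearest-neighbor search standing in for the approximate-search index a production system would use:

```python
import numpy as np

def retrieve(query_vec, doc_vecs, k=2):
    """Return indices of the top-k documents by cosine similarity to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                    # cosine similarity of each doc to the query
    return np.argsort(-scores)[:k]    # indices of the highest-scoring docs

# Toy embeddings: doc 0 points along the query, doc 1 is orthogonal to it.
docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
print(retrieve(np.array([1.0, 0.1]), docs, k=2))  # [0 2]
```

In a real deployment the exhaustive `argsort` is replaced by an approximate nearest-neighbor index, since scoring every document is infeasible at search-engine scale.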
1 code implementation • 5 Jun 2020 • Ming Zhang, Yawei Wang, Xiaoteng Ma, Li Xia, Jun Yang, Zhiheng Li, Xiu Li
Generative adversarial imitation learning (GAIL) provides an adversarial learning framework for imitating an expert policy from demonstrations in high-dimensional continuous tasks.
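In GAIL, the policy's surrogate reward comes from a discriminator trained to tell expert state-action pairs from policy-generated ones. A minimal sketch of one common reward convention (conventions differ across implementations; `d_prob` here denotes the discriminator's probability that the pair is expert-like):

```python
import math

def gail_reward(d_prob, eps=1e-8):
    """Surrogate reward -log(1 - D(s, a)).

    The reward grows as the discriminator becomes more convinced the
    state-action pair came from the expert, so maximizing it pushes the
    policy's occupancy measure toward the expert's.
    """
    return -math.log(1.0 - d_prob + eps)

print(gail_reward(0.5))  # about log(2), i.e. ~0.693
```

The `eps` term is a standard numerical guard against `log(0)` when the discriminator saturates.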
no code implementations • 30 Apr 2020 • Xiaoteng Ma, Li Xia, Zhengyuan Zhou, Jun Yang, Qianchuan Zhao
In this paper, we present a new reinforcement learning (RL) algorithm called Distributional Soft Actor Critic (DSAC), which exploits the distributional information of accumulated rewards to achieve better performance.
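A distributional critic of this kind is typically trained by representing the return distribution as a set of quantiles and minimizing a quantile Huber loss. The sketch below shows that generic ingredient; it is a standard distributional-RL objective (QR-DQN-style), not necessarily DSAC's exact critic loss.

```python
import numpy as np

def quantile_huber_loss(pred_quantiles, target, taus, kappa=1.0):
    """Quantile Huber loss for a critic that outputs return quantiles.

    Each predicted quantile i (at level taus[i]) is pulled toward the target
    with an asymmetric weight |tau - 1{u < 0}|, smoothed by the Huber kernel.
    """
    u = target - pred_quantiles                       # TD error per quantile
    huber = np.where(np.abs(u) <= kappa,
                     0.5 * u ** 2,
                     kappa * (np.abs(u) - 0.5 * kappa))
    return np.mean(np.abs(taus - (u < 0)) * huber / kappa)

preds = np.array([1.0, 2.0])
taus = np.array([0.25, 0.75])
print(quantile_huber_loss(preds, 1.5, taus))  # 0.03125
```

When every predicted quantile equals the target the loss is zero; the asymmetric weighting is what makes the minimizer recover the target distribution's quantiles rather than its mean.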