1 code implementation • 29 Feb 2024 • Zhiyu An, Xianzhong Ding, Wan Du
We found that the high dimensionality of the thermal dynamics model input hinders the efficiency of policy extraction.
no code implementations • 20 Feb 2024 • Zhiyu An, Xianzhong Ding, Wan Du
Recent years have seen an emerging interest in the trustworthiness of machine learning-based agents in the wild, especially in robotics, to provide safety assurance for the industry.
1 code implementation • 10 Nov 2023 • Yifei Xu, Yuning Chen, Xumiao Zhang, Xianshang Lin, Pan Hu, Yunfei Ma, Songwu Lu, Wan Du, Zhuoqing Mao, Ennan Zhai, Dennis Cai
We develop the CloudEval-YAML benchmark with practicality in mind: the dataset consists of hand-written problems with unit tests targeting practical scenarios.
no code implementations • 3 Oct 2023 • Xianzhong Ding, Le Chen, Murali Emani, Chunhua Liao, Pei-Hung Lin, Tristan Vanderbruggen, Zhen Xie, Alberto E. Cerpa, Wan Du
Large Language Models (LLMs), including the LLaMA model, have exhibited their efficacy across various general-domain natural language processing (NLP) tasks.
no code implementations • 4 Apr 2023 • Xianzhong Ding, Wan Du
The system employs a neural network, known as the DRL control agent, which learns an optimal control policy that considers both the current soil moisture measurement and the future soil moisture loss.
no code implementations • 1 Feb 2023 • Xianzhong Ding, Alberto Cerpa, Wan Du
In this paper, we conduct a set of experiments to analyze the limitations of current MBRL-based HVAC control methods, in terms of model uncertainty and controller effectiveness.
no code implementations • 27 Jan 2023 • Xianzhong Ding, Alberto Cerpa, Wan Du
The DRL architecture includes a novel reward function that allows the framework to explore the tradeoffs between energy use and users' comfort, while at the same time enabling the solution of the high-dimensional control problem due to the interactions of four different building subsystems.
no code implementations • 29 Sep 2021 • Kunpeng Liu, Pengfei Wang, Dongjie Wang, Wan Du, Dapeng Oliver Wu, Yanjie Fu
In this paper, we propose a single-agent Monte Carlo based reinforced feature selection (MCRFS) method, as well as two efficiency improvement strategies, i.e., early stopping (ES) strategy and reward-level interactive (RI) strategy.
no code implementations • 20 Jul 2019 • Zhihao Shen, Wan Du, Xi Zhao, Jianhua Zou
Retrieving similar trajectories from a large trajectory dataset is important for a variety of applications, like transportation planning and mobility analysis.
no code implementations • 10 Jun 2019 • Jie Liu, Jiawen Liu, Wan Du, Dong Li
In this paper, we perform a variety of experiments on a representative mobile device (the NVIDIA TX2) to study the performance of training deep learning models.