no code implementations • 20 May 2024 • Taiyuan Mei, Yun Zi, Xiaohan Cheng, Zijun Gao, Qi Wang, Haowei Yang
The internal structure and operating mechanism of large-scale language models are analyzed theoretically, in particular how the Transformer and its derivative architectures can limit computational efficiency while capturing long-range dependencies.
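The efficiency limit alluded to above comes from self-attention comparing every token with every other token, so the score matrix grows quadratically with sequence length. The following pure-Python sketch of scaled dot-product attention (illustrative only, not the paper's implementation; no batching or masking) makes that `n x n` cost explicit:

```python
import math

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention over lists of vectors.
    The score matrix is len(Q) x len(K) -- the quadratic term
    that limits efficiency on long sequences."""
    d = len(K[0])
    # One dot product per (query, key) pair -> O(n^2 * d) work and O(n^2) memory.
    scores = [[sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
              for q in Q]
    out = []
    for row in scores:
        # Numerically stable row-wise softmax.
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Each output row is a convex combination of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Tiny example: two orthogonal tokens attend mostly to themselves.
out = scaled_dot_product_attention([[1.0, 0.0], [0.0, 1.0]],
                                   [[1.0, 0.0], [0.0, 1.0]],
                                   [[1.0, 0.0], [0.0, 1.0]])
```

Derivative architectures (sparse, linear, or windowed attention) all target the quadratic `scores` step above.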
no code implementations • 16 Oct 2023 • Xinxin Yan, Zhideng Zhou, Xiaohan Cheng, Xiaolei Yang
Compared with traditional methods, the learned unconstrained and semi-constrained schemes substantially reduce the prediction error on coarse grids.
no code implementations • 1 Dec 2020 • Xiaohan Cheng
A model based on Sarsa, an on-policy reinforcement learning algorithm, is proposed to automatically generate a set of strategies for reducing the employee turnover rate.
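For reference, the core of Sarsa is an on-policy temporal-difference update whose target uses the action actually taken next. A minimal tabular sketch on a toy chain environment (purely illustrative; the environment, hyperparameters, and function names are assumptions, not the paper's setup) looks like this:

```python
import random

def sarsa_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Sarsa on a toy chain MDP: states 0..n_states-1,
    actions 0 (left) / 1 (right); reaching the last state gives
    reward 1 and ends the episode."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]

    def policy(s):
        # Epsilon-greedy action selection (ties broken toward action 1).
        if rng.random() < eps:
            return rng.randrange(2)
        return 0 if Q[s][0] > Q[s][1] else 1

    for _ in range(episodes):
        s, a = 0, policy(0)
        while True:
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            done = s2 == n_states - 1
            r = 1.0 if done else 0.0
            a2 = policy(s2)
            # Sarsa update: the TD target bootstraps on the action
            # the current policy actually takes next (on-policy).
            target = r + (0.0 if done else gamma * Q[s2][a2])
            Q[s][a] += alpha * (target - Q[s][a])
            if done:
                break
            s, a = s2, a2
    return Q

Q = sarsa_chain()
```

After training, the greedy policy prefers moving right (toward the goal) in every non-terminal state; in the paper's setting the states, actions, and rewards would instead encode employee and retention-strategy features.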