1 code implementation • 6 Oct 2022 • Le Zhao, Mingcai Chen, Yuntao Du, Haiyang Yang, Chongjun Wang
We design an attention module to capture long-term dependency by mining periodic information in traffic data.
no code implementations • 29 Sep 2021 • Le Zhao, Wei Xu
In practice, discounted episodic return from the training experience or discounted goal return from hindsight relabeling can serve as the value lower bound when the environment is deterministic.