1 code implementation • NeurIPS 2023 • Ti-Rong Wu, Hung Guei, Ting Han Wei, Chung-Chin Shih, Jui-Te Chin, I-Chen Wu
Solving a game typically means finding its game-theoretic value (the outcome under optimal play) and, optionally, a full strategy that achieves that outcome.
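To make the notion of a game-theoretic value concrete, here is a minimal sketch (not from the paper) that computes it for a toy subtraction game with memoized negamax search. Each player removes 1 or 2 stones from a pile; whoever takes the last stone wins. `solve(n)` returns +1 if the player to move wins under optimal play, -1 otherwise.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def solve(n: int) -> int:
    """Game-theoretic value of the subtraction game for the player to move."""
    if n == 0:
        return -1  # no stones left: the player to move has already lost
    # Negamax: the value for the player to move is the best negated
    # child value over all legal moves (remove 1 or 2 stones).
    return max(-solve(n - m) for m in (1, 2) if m <= n)

# Piles that are multiples of 3 are losses for the player to move.
print([solve(n) for n in range(1, 7)])  # [1, 1, -1, 1, 1, -1]
```

A full solution would also record the winning move at each state, yielding the optional strategy mentioned above.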
1 code implementation • 17 Oct 2023 • Ti-Rong Wu, Hung Guei, Po-Wei Huang, Pei-Chiun Peng, Ting Han Wei, Chung-Chin Shih, Yun-Jui Tsai
This paper presents MiniZero, a zero-knowledge learning framework that supports four state-of-the-art algorithms: AlphaZero, MuZero, Gumbel AlphaZero, and Gumbel MuZero.
no code implementations • 22 Dec 2022 • Chung-Chin Shih, Ting Han Wei, Ti-Rong Wu, I-Chen Wu
Experiments also show that using an RZT instead of a traditional transposition table significantly reduces the number of nodes searched on two data sets of 7x7 and 19x19 life-and-death (L&D) Go problems.
no code implementations • 5 Dec 2021 • Chung-Chin Shih, Ti-Rong Wu, Ting Han Wei, I-Chen Wu
This paper first proposes a novel RZ-based approach, called RZ-Based Search (RZS), for solving life-and-death (L&D) problems in Go.
1 code implementation • ICLR 2022 • Ti-Rong Wu, Chung-Chin Shih, Ting Han Wei, Meng-Yu Tsai, Wei-Yuan Hsu, I-Chen Wu
We train a Proof Cost Network (PCN), where proof cost is a heuristic that estimates the amount of work required to solve problems.
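A proof-cost estimate is typically used to decide which frontier node to expand next. The hypothetical sketch below (names invented; not the paper's code) shows that role with a best-first solver on a toy graph, where a stand-in function plays the part of a trained PCN forward pass.

```python
import heapq

def estimated_proof_cost(state):
    # Stand-in for a PCN forward pass: fewer remaining steps -> lower cost.
    return abs(10 - state)

def best_first_solve(start, is_solved, successors):
    """Expand the frontier node with the lowest estimated proof cost first."""
    frontier = [(estimated_proof_cost(start), start)]
    seen = {start}
    while frontier:
        _, state = heapq.heappop(frontier)
        if is_solved(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (estimated_proof_cost(nxt), nxt))
    return None

# Toy problem: reach 10 from 0 by +1 or +3 steps.
result = best_first_solve(0, lambda s: s == 10, lambda s: [s + 1, s + 3])
print(result)  # 10
```

With a well-calibrated cost estimate, the search expands far fewer nodes than an uninformed strategy, which is the benefit the PCN heuristic aims for.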
no code implementations • NeurIPS Workshop ICBINB 2020 • Kai-Chun Hu, Ping-Chun Hsieh, Ting Han Wei, I-Chen Wu
Deep policy gradient is one of the major frameworks in reinforcement learning, and it has been shown to improve parameterized policies across various tasks and environments.
no code implementations • 24 Apr 2019 • Kai-Chun Hu, Chen-Huan Pi, Ting Han Wei, I-Chen Wu, Stone Cheng, Yi-Wei Dai, Wei-Yuan Ye
In this paper, we point out a fundamental property of the reinforcement learning objective that lets us reformulate the policy gradient objective as a perceptron-like loss function, removing the need to distinguish between on-policy and off-policy training.
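The sketch below is an illustration of the general idea, not the paper's exact formulation: a perceptron-style, mistake-driven update for a softmax policy, where actions whose sampled return beats a baseline are pushed up and others pushed down, with no importance weights distinguishing on-policy from off-policy data. All names and the toy setup are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 3
logits = np.zeros((n_states, n_actions))  # tabular softmax policy parameters

def update(state, action, ret, baseline, lr=0.5):
    # Perceptron-like rule: only the sign of (return - baseline) decides
    # the update direction, as in mistake-driven perceptron training.
    direction = 1.0 if ret > baseline else -1.0
    probs = np.exp(logits[state]) / np.exp(logits[state]).sum()
    grad = -probs
    grad[action] += 1.0  # gradient of log-softmax w.r.t. the logits
    logits[state] += lr * direction * grad

# Toy data: in every state, action 0 yields return 1, other actions yield 0.
# Actions are sampled uniformly, i.e. off-policy with respect to the learner.
for _ in range(200):
    s = rng.integers(n_states)
    a = rng.integers(n_actions)
    update(s, a, ret=1.0 if a == 0 else 0.0, baseline=0.5)

print(logits.argmax(axis=1))  # action 0 preferred in every state
```

Because the update depends only on the sign of the advantage, the same rule applies to data gathered by any behavior policy, which is the property the reformulation exploits.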