no code implementations • 4 Apr 2024 • Jiacai Liu, Wenye Li, Ke Wei
Projected policy gradient under the simplex parameterization, together with policy gradient and natural policy gradient under the softmax parameterization, are fundamental algorithms in reinforcement learning.
1 code implementation • 16 Feb 2024 • Yiwen Sun, Xianyin Zhang, Shiyu Huang, Shaowei Cai, Bing-Zhen Zhang, Ke Wei
Heuristics are crucial in SAT solvers, yet no heuristic rule is suitable for all problem instances.
no code implementations • 2 Jan 2024 • Jie Feng, Ke Wei, Jinchi Chen
Natural policy gradient (NPG) and its variants are widely used policy search methods in reinforcement learning.
no code implementations • 31 May 2023 • Jiacai Liu, Jinchi Chen, Ke Wei
To show the local linear convergence of the algorithm, we establish the contraction of the sub-optimal probability $b_s^k$ (i.e., the probability that the output policy $\pi^k$ assigns to non-optimal actions) when $k\ge k_0$.
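The quantity $b_s^k$ described above can be illustrated with a small sketch. This is only a reading of the definition given in the abstract, not code from the paper: the function name `suboptimal_probability`, the example policy, and the choice of optimal action are all hypothetical.

```python
import numpy as np

def suboptimal_probability(policy_row, optimal_actions):
    """Total probability that one state's policy puts on non-optimal actions."""
    policy_row = np.asarray(policy_row, dtype=float)
    mask = np.ones(policy_row.shape[0], dtype=bool)
    mask[list(optimal_actions)] = False  # True marks the non-optimal actions
    return float(policy_row[mask].sum())

# Hypothetical state with 4 actions, where action 2 is the only optimal one.
pi_k = [0.05, 0.10, 0.80, 0.05]
b = suboptimal_probability(pi_k, optimal_actions={2})
```

Under this reading, a contraction of $b_s^k$ means this value shrinks by a constant factor per iteration once $k\ge k_0$.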
3 code implementations • ICLR 2021 • Zhengyang Geng, Meng-Hao Guo, Hongxu Chen, Xia Li, Ke Wei, Zhouchen Lin
As an essential ingredient of modern deep learning, the attention mechanism, especially self-attention, plays a vital role in discovering global correlations.
Ranked #7 on Semantic Segmentation on PASCAL VOC 2012 test
1 code implementation • 15 Nov 2017 • HanQin Cai, Jian-Feng Cai, Ke Wei
We study robust PCA for the fully observed setting, which is about separating a low-rank matrix $\boldsymbol{L}$ and a sparse matrix $\boldsymbol{S}$ from their sum $\boldsymbol{D}=\boldsymbol{L}+\boldsymbol{S}$.
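The problem setup $\boldsymbol{D}=\boldsymbol{L}+\boldsymbol{S}$ can be made concrete with a synthetic instance. The sketch below only builds the data; it does not implement the paper's recovery algorithm. The sizes `n`, `r`, `num_outliers` and the outlier magnitudes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, num_outliers = 50, 3, 100  # hypothetical sizes

# Low-rank component: product of two thin Gaussian factors (rank r almost surely).
L = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))

# Sparse component: a few large entries at distinct random positions.
S = np.zeros((n, n))
idx = rng.choice(n * n, size=num_outliers, replace=False)
S.flat[idx] = rng.uniform(5.0, 10.0, size=num_outliers)

# Observed matrix; robust PCA aims to recover L and S from D alone.
D = L + S

rank_L = np.linalg.matrix_rank(L)
nnz_S = np.count_nonzero(S)
```

In the fully observed setting every entry of `D` is available, so the only difficulty is deciding which part of each entry belongs to `L` and which to `S`.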
no code implementations • 26 Apr 2017 • Ke Wei, Ke Yin, Xue-Cheng Tai, Tony F. Chan
We propose an effective framework for multi-phase image segmentation and semi-supervised data clustering by introducing a novel region force term into the Potts model.
1 code implementation • 15 Jun 2016 • XiaoDong Li, Shuyang Ling, Thomas Strohmer, Ke Wei
To the best of our knowledge, our algorithm is the first blind deconvolution algorithm that is numerically efficient, robust against noise, and comes with rigorous recovery guarantees under certain subspace conditions.
Information Theory