1 code implementation • 12 Mar 2024 • Zhenfeng He, Yao Shu, Zhongxiang Dai, Bryan Kian Hsiang Low
Nevertheless, the estimation ability of these metrics typically varies across different tasks, making it challenging to achieve robust and consistently good search performance on diverse tasks with only a single training-free metric.
no code implementations • 5 Mar 2024 • Wenyang Hu, Yao Shu, Zongmin Yu, Zhaoxuan Wu, Xiangqiang Lin, Zhongxiang Dai, See-Kiong Ng, Bryan Kian Hsiang Low
Existing methodologies usually prioritize global optimization to find the global optimum, which, however, performs poorly on certain tasks.
no code implementations • 18 Feb 2024 • Yao Shu, Jiongfeng Fang, Ying Tiffany He, Fei Richard Yu
First-order optimization (FOO) algorithms are pivotal in numerous computational domains such as machine learning and signal denoising.
1 code implementation • 2 Oct 2023 • Xiaoqiang Lin, Zhaoxuan Wu, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet, Bryan Kian Hsiang Low
We perform instruction optimization for ChatGPT and show through extensive experiments that our INSTINCT consistently outperforms existing methods across different tasks, such as various instruction induction tasks and the task of improving the zero-shot chain-of-thought instruction.
1 code implementation • 8 Aug 2023 • Yao Shu, Xiaoqiang Lin, Zhongxiang Dai, Bryan Kian Hsiang Low
To this end, we (a) introduce trajectory-informed gradient surrogates, which are able to use the history of function queries during optimization for accurate and query-efficient gradient estimation, and (b) develop the technique of adaptive gradient correction using these gradient surrogates to mitigate the aforementioned disparity.
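The surrogate idea lends itself to a compact illustration. Below is a minimal sketch, assuming a ridge-regularized linear surrogate fitted to a window of recent queries and differentiated for a gradient estimate; it is not the paper's implementation, and `surrogate_gradient`, `zo_optimize`, and all hyperparameters are hypothetical stand-ins.

```python
import numpy as np

def surrogate_gradient(history_x, history_f, x, ridge=1e-6):
    # Fit f(x + d) ~ c + g.d on recent queries and return g as the gradient
    # estimate (a stand-in for trajectory-informed gradient surrogates,
    # which reuse past function queries instead of fresh finite differences).
    X = np.asarray(history_x) - x                  # offsets of past queries from x
    y = np.asarray(history_f)
    A = np.hstack([np.ones((len(X), 1)), X])       # features [1, d]
    coef = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y)
    return coef[1:]                                # linear part approximates the gradient

def zo_optimize(f, x0, steps=200, lr=0.05, window=20):
    x = np.array(x0, dtype=float)
    hist_x, hist_f = [], []
    for _ in range(steps):
        hist_x.append(x.copy()); hist_f.append(f(x))   # one query per step
        if len(hist_x) > len(x):                       # enough points to fit
            x = x - lr * surrogate_gradient(hist_x[-window:], hist_f[-window:], x)
        else:
            x = x + 0.1 * np.random.randn(len(x))      # warm-up exploration
    return x

# Example: minimize a smooth function using only function queries.
print(zo_optimize(lambda v: float(((v - 1.0) ** 2).sum()), np.zeros(3)))
```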
1 code implementation • 13 Oct 2022 • Zhongxiang Dai, Yao Shu, Bryan Kian Hsiang Low, Patrick Jaillet
…linear model), which is equivalently sampled from the GP posterior with the NTK as the kernel function.
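The excerpt above is truncated, but the equivalence it refers to, a trained model whose prediction is a sample from a GP posterior with the NTK as the kernel, can be sketched directly on the GP side. A minimal sketch of one Thompson sampling step, with an RBF kernel standing in for the NTK and all names hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, ls=0.5):
    # RBF kernel as a simple stand-in for the NTK used in the paper.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

def gp_posterior_sample(X_obs, y_obs, X_cand, noise=1e-3, kernel=rbf):
    # Draw one function sample from the GP posterior at candidate points.
    K = kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
    Ks = kernel(X_cand, X_obs)
    mu = Ks @ np.linalg.solve(K, y_obs)
    cov = kernel(X_cand, X_cand) - Ks @ np.linalg.solve(K, Ks.T)
    return rng.multivariate_normal(mu, cov + 1e-9 * np.eye(len(mu)))

# Thompson sampling step: query where the posterior sample is largest.
f = lambda X: np.sin(3 * X[:, 0]) * X[:, 1]       # toy black-box objective
X_obs = rng.random((5, 2)); y_obs = f(X_obs)
X_cand = rng.random((200, 2))
x_next = X_cand[np.argmax(gp_posterior_sample(X_obs, y_obs, X_cand))]
```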
1 code implementation • 28 May 2022 • Zhongxiang Dai, Yao Shu, Arun Verma, Flint Xiaofeng Fan, Bryan Kian Hsiang Low, Patrick Jaillet
To better exploit the federated setting, FN-UCB adopts a weighted combination of two UCBs: $\text{UCB}^{a}$ allows every agent to additionally use the observations from the other agents to accelerate exploration (without sharing raw observations), while $\text{UCB}^{b}$ uses an NN with aggregated parameters for reward prediction in a similar way to federated averaging for supervised learning.
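A minimal sketch of the weighted-combination idea only, not the paper's exact FN-UCB rule; the inputs, weight $\beta$, and confidence width below are hypothetical stand-ins:

```python
import numpy as np

def weighted_ucb_score(mu_a, sigma_a, mu_b, beta=0.5, alpha=1.0):
    # mu_a, sigma_a: mean/uncertainty built from the agents' shared
    # statistics (drives the exploration-focused UCB^a); mu_b: reward
    # prediction of the NN with federated-averaged parameters (UCB^b).
    ucb_a = mu_a + alpha * sigma_a
    ucb_b = mu_b + alpha * sigma_a
    return beta * ucb_a + (1.0 - beta) * ucb_b

# Choose the arm maximizing the weighted combination of the two UCBs.
mu_a = np.array([0.20, 0.50, 0.10])
sigma_a = np.array([0.30, 0.10, 0.40])
mu_b = np.array([0.25, 0.45, 0.20])
arm = int(np.argmax(weighted_ucb_score(mu_a, sigma_a, mu_b)))
```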
1 code implementation • 24 Jan 2022 • Yao Shu, Zhongxiang Dai, Zhaoxuan Wu, Bryan Kian Hsiang Low
As a consequence, (a) the relationships among these metrics are unclear, (b) there is no theoretical interpretation for their empirical performances, and (c) there may exist untapped potential in existing training-free NAS, which probably can be unveiled through a unified theoretical understanding.
no code implementations • 6 Sep 2021 • Yao Shu, Yizhou Chen, Zhongxiang Dai, Bryan Kian Hsiang Low
Unfortunately, these NAS algorithms aim to select only a single well-performing architecture from their search spaces and thus overlook the capability of a neural network ensemble (i.e., an ensemble of neural networks with diverse architectures) to achieve improved performance over a single final selected architecture.
no code implementations • ICLR 2022 • Yao Shu, Shaofeng Cai, Zhongxiang Dai, Beng Chin Ooi, Bryan Kian Hsiang Low
Recent years have witnessed a surging interest in Neural Architecture Search (NAS).
no code implementations • 17 Oct 2020 • Min Zhang, Yao Shu, Kun He
Finite-sum optimization plays an important role in the area of machine learning, and hence has triggered a surge of interest in recent years.
no code implementations • ICLR 2020 • Shaofeng Cai, Yao Shu, Wei Wang, Gang Chen, Beng Chin Ooi
Recent years have witnessed growing interest in designing efficient neural networks and in neural architecture search (NAS).
1 code implementation • ICLR 2020 • Yao Shu, Wei Wang, Shaofeng Cai
Neural architecture search (NAS) searches architectures automatically for given tasks, e.g., image classification and language modeling.
no code implementations • 13 May 2019 • Shaofeng Cai, Yao Shu, Wei Wang, Beng Chin Ooi
The deployment of deep neural networks in real-world applications is mostly restricted by their high inference costs.
no code implementations • 6 Apr 2019 • Shaofeng Cai, Yao Shu, Gang Chen, Beng Chin Ooi, Wei Wang, Meihui Zhang
However, many recent works show that the standard dropout is ineffective or even detrimental to the training of CNNs.
no code implementations • 19 Feb 2019 • Junzhe Zhang, Sai Ho Yeung, Yao Shu, Bingsheng He, Wei Wang
They are achieved by exploiting the iterative nature of deep learning training algorithms to derive the lifetime and read/write order of all variables.
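A minimal sketch of the lifetime-derivation idea, assuming a hypothetical per-iteration access trace rather than the paper's actual instrumentation: because training repeats the same computation each iteration, one recorded trace of variable accesses determines every variable's first and last access, so buffers with non-overlapping lifetimes can share memory.

```python
def lifetimes(trace):
    # trace: one iteration's ordered record of which variables each
    # step reads or writes. First/last access gives each lifetime.
    first, last = {}, {}
    for step, accessed in enumerate(trace):
        for v in accessed:
            first.setdefault(v, step)
            last[v] = step
    return {v: (first[v], last[v]) for v in first}

# Hypothetical access trace for one training iteration.
trace = [("x", "w"), ("h",), ("h", "loss"), ("grad_w",), ("w", "grad_w")]
print(lifetimes(trace))
# 'h' is live only during steps 1 and 2, so its buffer can be reused afterwards.
```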
no code implementations • 2 Apr 2017 • Kun He, Jingbo Wang, Haochuan Li, Yao Shu, Mengxiao Zhang, Man Zhu, Li-Wei Wang, John E. Hopcroft
Toward a deeper understanding of the inner workings of deep neural networks, we investigate CNNs (convolutional neural networks) using DCNs (deconvolutional networks) and a randomization technique, and gain new insights into the intrinsic properties of this network architecture.