1 code implementation • 12 Mar 2024 • Zhenfeng He, Yao Shu, Zhongxiang Dai, Bryan Kian Hsiang Low
Nevertheless, the estimation ability of these metrics typically varies across different tasks, making it challenging to achieve robust and consistently good search performance on diverse tasks with only a single training-free metric.
no code implementations • 5 Mar 2024 • Wenyang Hu, Yao Shu, Zongmin Yu, Zhaoxuan Wu, Xiangqiang Lin, Zhongxiang Dai, See-Kiong Ng, Bryan Kian Hsiang Low
Existing methodologies usually prioritize global optimization to find the global optimum, which, however, performs poorly on certain tasks.
no code implementations • 18 Feb 2024 • Yao Shu, Jiongfeng Fang, Ying Tiffany He, Fei Richard Yu
First-order optimization (FOO) algorithms are pivotal in numerous computational domains such as machine learning and signal denoising.
1 code implementation • 2 Oct 2023 • Xiaoqiang Lin, Zhaoxuan Wu, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet, Bryan Kian Hsiang Low
We perform instruction optimization for ChatGPT and show through extensive experiments that our INSTINCT consistently outperforms existing methods across different tasks, such as various instruction induction tasks and the task of improving the zero-shot chain-of-thought instruction.
1 code implementation • 8 Aug 2023 • Yao Shu, Xiaoqiang Lin, Zhongxiang Dai, Bryan Kian Hsiang Low
To this end, we (a) introduce trajectory-informed gradient surrogates, which are able to use the history of function queries during optimization for accurate and query-efficient gradient estimation, and (b) develop the technique of adaptive gradient correction using these gradient surrogates to mitigate the aforementioned disparity.
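The surrogate idea lends itself to a compact illustration. Below is a minimal sketch, assuming a ridge-regularized linear surrogate fitted to a window of recent queries and differentiated for a gradient estimate; it is not the paper's implementation, and `surrogate_gradient`, `zo_optimize`, and all hyperparameters are hypothetical stand-ins.

```python
import numpy as np

def surrogate_gradient(history_x, history_f, x, ridge=1e-6):
    # Fit f(x + d) ~ c + g.d on recent queries and return g as the gradient
    # estimate (a stand-in for trajectory-informed gradient surrogates,
    # which reuse past function queries instead of fresh finite differences).
    X = np.asarray(history_x) - x                  # offsets of past queries from x
    y = np.asarray(history_f)
    A = np.hstack([np.ones((len(X), 1)), X])       # features [1, d]
    coef = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y)
    return coef[1:]                                # linear part approximates the gradient

def zo_optimize(f, x0, steps=200, lr=0.05, window=20):
    x = np.array(x0, dtype=float)
    hist_x, hist_f = [], []
    for _ in range(steps):
        hist_x.append(x.copy()); hist_f.append(f(x))   # one query per step
        if len(hist_x) > len(x):                       # enough points to fit
            x = x - lr * surrogate_gradient(hist_x[-window:], hist_f[-window:], x)
        else:
            x = x + 0.1 * np.random.randn(len(x))      # warm-up exploration
    return x

# Example: minimize a smooth function using only function queries.
print(zo_optimize(lambda v: float(((v - 1.0) ** 2).sum()), np.zeros(3)))
```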
1 code implementation • 13 Oct 2022 • Zhongxiang Dai, Yao Shu, Bryan Kian Hsiang Low, Patrick Jaillet
…linear model), which is equivalently sampled from the GP posterior with the NTK as the kernel function.
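The excerpt above is truncated, but the equivalence it refers to, a trained model whose prediction is a sample from a GP posterior with the NTK as the kernel, can be sketched directly on the GP side. A minimal sketch of one Thompson sampling step, with an RBF kernel standing in for the NTK and all names hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, ls=0.5):
    # RBF kernel as a simple stand-in for the NTK used in the paper.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

def gp_posterior_sample(X_obs, y_obs, X_cand, noise=1e-3, kernel=rbf):
    # Draw one function sample from the GP posterior at candidate points.
    K = kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
    Ks = kernel(X_cand, X_obs)
    mu = Ks @ np.linalg.solve(K, y_obs)
    cov = kernel(X_cand, X_cand) - Ks @ np.linalg.solve(K, Ks.T)
    return rng.multivariate_normal(mu, cov + 1e-9 * np.eye(len(mu)))

# Thompson sampling step: query where the posterior sample is largest.
f = lambda X: np.sin(3 * X[:, 0]) * X[:, 1]       # toy black-box objective
X_obs = rng.random((5, 2)); y_obs = f(X_obs)
X_cand = rng.random((200, 2))
x_next = X_cand[np.argmax(gp_posterior_sample(X_obs, y_obs, X_cand))]
```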
1 code implementation • 28 May 2022 • Zhongxiang Dai, Yao Shu, Arun Verma, Flint Xiaofeng Fan, Bryan Kian Hsiang Low, Patrick Jaillet
To better exploit the federated setting, FN-UCB adopts a weighted combination of two UCBs: $\text{UCB}^{a}$ allows every agent to additionally use the observations from the other agents to accelerate exploration (without sharing raw observations), while $\text{UCB}^{b}$ uses an NN with aggregated parameters for reward prediction in a similar way to federated averaging for supervised learning.
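A minimal sketch of the weighted-combination idea only, not the paper's exact FN-UCB rule; the inputs, weight $\beta$, and confidence width below are hypothetical stand-ins:

```python
import numpy as np

def weighted_ucb_score(mu_a, sigma_a, mu_b, beta=0.5, alpha=1.0):
    # mu_a, sigma_a: mean/uncertainty built from the agents' shared
    # statistics (drives the exploration-focused UCB^a); mu_b: reward
    # prediction of the NN with federated-averaged parameters (UCB^b).
    ucb_a = mu_a + alpha * sigma_a
    ucb_b = mu_b + alpha * sigma_a
    return beta * ucb_a + (1.0 - beta) * ucb_b

# Choose the arm maximizing the weighted combination of the two UCBs.
mu_a = np.array([0.20, 0.50, 0.10])
sigma_a = np.array([0.30, 0.10, 0.40])
mu_b = np.array([0.25, 0.45, 0.20])
arm = int(np.argmax(weighted_ucb_score(mu_a, sigma_a, mu_b)))
```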
1 code implementation • 24 Jan 2022 • Yao Shu, Zhongxiang Dai, Zhaoxuan Wu, Bryan Kian Hsiang Low
As a consequence, (a) the relationships among these metrics are unclear, (b) there is no theoretical interpretation for their empirical performances, and (c) there may exist untapped potential in existing training-free NAS, which probably can be unveiled through a unified theoretical understanding.
no code implementations • 6 Sep 2021 • Yao Shu, Yizhou Chen, Zhongxiang Dai, Bryan Kian Hsiang Low
Unfortunately, these NAS algorithms aim to select only a single well-performing architecture from their search spaces and thus overlook the capability of a neural network ensemble (i.e., an ensemble of neural networks with diverse architectures) to achieve improved performance over a single final selected architecture.
no code implementations • ICLR 2022 • Yao Shu, Shaofeng Cai, Zhongxiang Dai, Beng Chin Ooi, Bryan Kian Hsiang Low
Recent years have witnessed a surging interest in Neural Architecture Search (NAS).
no code implementations • 17 Oct 2020 • Min Zhang, Yao Shu, Kun He
Finite-sum optimization plays an important role in the area of machine learning, and hence has triggered a surge of interest in recent years.
no code implementations • ICLR 2020 • Shaofeng Cai, Yao Shu, Wei Wang, Gang Chen, Beng Chin Ooi
Recent years have witnessed growing interest in designing efficient neural networks and in neural architecture search (NAS).
1 code implementation • ICLR 2020 • Yao Shu, Wei Wang, Shaofeng Cai
Neural architecture search (NAS) searches architectures automatically for given tasks, e.g., image classification and language modeling.
no code implementations • 13 May 2019 • Shaofeng Cai, Yao Shu, Wei Wang, Beng Chin Ooi
The deployment of deep neural networks in real-world applications is mostly restricted by their high inference costs.
no code implementations • 6 Apr 2019 • Shaofeng Cai, Yao Shu, Gang Chen, Beng Chin Ooi, Wei Wang, Meihui Zhang
However, many recent works show that the standard dropout is ineffective or even detrimental to the training of CNNs.
no code implementations • 19 Feb 2019 • Junzhe Zhang, Sai Ho Yeung, Yao Shu, Bingsheng He, Wei Wang
They are achieved by exploiting the iterative nature of deep learning training algorithms to derive the lifetime and read/write order of all variables.
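A minimal sketch of the lifetime-derivation idea, assuming a hypothetical per-iteration access trace rather than the paper's actual instrumentation: because training repeats the same computation each iteration, one recorded trace of variable accesses determines every variable's first and last access, so buffers with non-overlapping lifetimes can share memory.

```python
def lifetimes(trace):
    # trace: one iteration's ordered record of which variables each
    # step reads or writes. First/last access gives each lifetime.
    first, last = {}, {}
    for step, accessed in enumerate(trace):
        for v in accessed:
            first.setdefault(v, step)
            last[v] = step
    return {v: (first[v], last[v]) for v in first}

# Hypothetical access trace for one training iteration.
trace = [("x", "w"), ("h",), ("h", "loss"), ("grad_w",), ("w", "grad_w")]
print(lifetimes(trace))
# 'h' is live only during steps 1 and 2, so its buffer can be reused afterwards.
```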
no code implementations • 2 Apr 2017 • Kun He, Jingbo Wang, Haochuan Li, Yao Shu, Mengxiao Zhang, Man Zhu, Li-Wei Wang, John E. Hopcroft
Toward a deeper understanding of the inner workings of deep neural networks, we investigate CNNs (convolutional neural networks) using DCNs (deconvolutional networks) and a randomization technique, and gain new insights into the intrinsic properties of this network architecture.