Efficient Neural Architecture Search via Parameters Sharing

ICML 2018  ·  Hieu Pham, Melody Guan, Barret Zoph, Quoc Le, Jeff Dean

We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. ENAS constructs a large computational graph in which each subgraph represents a neural network architecture, thereby forcing all architectures to share their parameters. A controller is trained with policy gradient to search for a subgraph that maximizes the expected reward on a validation set. Meanwhile, the model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss. Sharing parameters among child models allows ENAS to deliver strong empirical performance while using far fewer GPU-hours than existing automatic model design approaches; notably, it is 1000x less expensive than standard Neural Architecture Search. On Penn Treebank, ENAS discovers a novel architecture that achieves a test perplexity of 56.3, on par with the existing state-of-the-art among all methods without post-training processing. On CIFAR-10, ENAS finds a novel architecture that achieves 2.89% test error, on par with the 2.65% test error of NASNet (Zoph et al., 2018).
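
The abstract describes an alternating optimization: shared child weights are updated by cross-entropy on the training set for sampled subgraphs, while the controller is updated by policy gradient (REINFORCE) using validation accuracy as the reward. The sketch below illustrates this weight-sharing loop under toy assumptions, not the paper's actual search space or controller: a two-layer, three-ops-per-layer space, a simple table-of-logits controller in place of ENAS's LSTM controller, and synthetic data. Names such as `SharedChild` and `Controller` are hypothetical.

```python
# Minimal ENAS-style weight-sharing sketch (toy assumptions; not the paper's setup).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Categorical

class SharedChild(nn.Module):
    """Shared weights: every candidate architecture reuses these parameters."""
    def __init__(self, dim=16, n_ops=3):
        super().__init__()
        # One candidate op per slot per layer; a subgraph picks one op per layer.
        self.layer1 = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_ops)])
        self.layer2 = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_ops)])
        self.head = nn.Linear(dim, 2)

    def forward(self, x, arch):
        # `arch` = (op index for layer 1, op index for layer 2) chosen by the controller.
        x = torch.relu(self.layer1[arch[0]](x))
        x = torch.relu(self.layer2[arch[1]](x))
        return self.head(x)

class Controller(nn.Module):
    """Samples one op per layer; trained with REINFORCE (the paper uses an LSTM)."""
    def __init__(self, n_layers=2, n_ops=3):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_layers, n_ops))

    def sample(self):
        dist = Categorical(logits=self.logits)
        arch = dist.sample()                       # one op index per layer
        return arch.tolist(), dist.log_prob(arch).sum()

# Synthetic train/validation data standing in for a real task.
torch.manual_seed(0)
x_train, y_train = torch.randn(256, 16), torch.randint(0, 2, (256,))
x_val, y_val = torch.randn(128, 16), torch.randint(0, 2, (128,))

child, controller = SharedChild(), Controller()
child_opt = torch.optim.SGD(child.parameters(), lr=0.05, momentum=0.9)
ctrl_opt = torch.optim.Adam(controller.parameters(), lr=0.01)
baseline = 0.0  # moving-average reward baseline to reduce gradient variance

for step in range(200):
    # Phase 1: train the shared weights on the training set for a sampled subgraph.
    arch, _ = controller.sample()
    child_opt.zero_grad()
    F.cross_entropy(child(x_train, arch), y_train).backward()
    child_opt.step()

    # Phase 2: update the controller with policy gradient; reward = validation accuracy.
    arch, log_prob = controller.sample()
    with torch.no_grad():
        reward = (child(x_val, arch).argmax(1) == y_val).float().mean().item()
    baseline = 0.95 * baseline + 0.05 * reward
    ctrl_opt.zero_grad()
    (-(reward - baseline) * log_prob).backward()
    ctrl_opt.step()

print("most likely architecture:", controller.logits.argmax(dim=1).tolist())
```

Because every subgraph reads and writes the same shared parameters, evaluating a candidate architecture only requires a forward pass through the shared weights rather than training it from scratch, which is where the claimed GPU-hour savings come from.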

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Neural Architecture Search | NAS-Bench-201, CIFAR-10 | ENAS | Accuracy (Test) | 54.3 | #32 |
| Neural Architecture Search | NAS-Bench-201, CIFAR-10 | ENAS | Accuracy (Val) | 39.77 | #29 |
| Neural Architecture Search | NAS-Bench-201, CIFAR-10 | ENAS | Search time (s) | 13315 | #12 |
| Neural Architecture Search | NAS-Bench-201, CIFAR-100 | ENAS | Accuracy (Test) | 15.61 | #33 |
| Neural Architecture Search | NAS-Bench-201, CIFAR-100 | ENAS | Accuracy (Val) | 15.03 | #31 |
| Neural Architecture Search | NAS-Bench-201, CIFAR-100 | ENAS | Search time (s) | 13315 | #10 |
| Neural Architecture Search | NAS-Bench-201, ImageNet-16-120 | ENAS | Accuracy (Test) | 16.43 | #40 |
| Neural Architecture Search | NAS-Bench-201, ImageNet-16-120 | ENAS | Search time (s) | 13315 | #13 |