Efficient Neural Architecture Search via Parameters Sharing
We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach to automatic model design. ENAS constructs a single large computational graph in which every subgraph represents a neural network architecture, thereby forcing all architectures to share their parameters. A controller is trained with policy gradient to search for a subgraph that maximizes the expected reward on a validation set, while the model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss. Sharing parameters among child models allows ENAS to deliver strong empirical performance while using far fewer GPU hours than existing automatic model design approaches; notably, it is 1000x less expensive than standard Neural Architecture Search. On Penn Treebank, ENAS discovers a novel architecture that achieves a test perplexity of 56.3, on par with the existing state of the art among all methods without post-training processing. On CIFAR-10, ENAS finds a novel architecture that achieves 2.89% test error, on par with the 2.65% test error of NASNet (Zoph et al., 2018).
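The abstract describes an alternating optimization: shared weights are trained by cross-entropy on sampled subgraphs, and a controller is trained by policy gradient (REINFORCE) with validation accuracy as the reward. Below is a minimal sketch of that loop, assuming PyTorch; the names (`SharedSupergraph`, `Controller`), the toy data, and the specific hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of ENAS-style weight sharing + REINFORCE (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_LAYERS, NUM_OPS, HIDDEN = 2, 3, 32

class SharedSupergraph(nn.Module):
    """One set of shared weights; each sampled architecture picks one op per layer."""
    def __init__(self, in_dim=16, num_classes=4):
        super().__init__()
        self.inp = nn.Linear(in_dim, HIDDEN)
        # Candidate ops per layer; their parameters are shared across all subgraphs.
        self.ops = nn.ModuleList([
            nn.ModuleList([nn.Linear(HIDDEN, HIDDEN) for _ in range(NUM_OPS)])
            for _ in range(NUM_LAYERS)
        ])
        self.out = nn.Linear(HIDDEN, num_classes)

    def forward(self, x, arch):
        h = torch.relu(self.inp(x))
        for layer_ops, op_idx in zip(self.ops, arch):
            h = torch.relu(layer_ops[op_idx](h))
        return self.out(h)

class Controller(nn.Module):
    """Samples an architecture (one op index per layer) from learned logits."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(NUM_LAYERS, NUM_OPS))

    def sample(self):
        dist = torch.distributions.Categorical(logits=self.logits)
        arch = dist.sample()
        return arch.tolist(), dist.log_prob(arch).sum()

# Toy data standing in for the training and validation splits.
x_tr, y_tr = torch.randn(256, 16), torch.randint(0, 4, (256,))
x_va, y_va = torch.randn(128, 16), torch.randint(0, 4, (128,))

shared, ctrl = SharedSupergraph(), Controller()
opt_w = torch.optim.Adam(shared.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(ctrl.parameters(), lr=1e-2)
baseline = 0.0  # moving-average baseline to reduce REINFORCE variance

for step in range(200):
    # Phase 1: train the shared weights on a sampled subgraph (cross-entropy).
    arch, _ = ctrl.sample()
    loss = F.cross_entropy(shared(x_tr, arch), y_tr)
    opt_w.zero_grad(); loss.backward(); opt_w.step()

    # Phase 2: update the controller with policy gradient;
    # the reward is the sampled subgraph's validation accuracy.
    arch, log_prob = ctrl.sample()
    with torch.no_grad():
        reward = (shared(x_va, arch).argmax(1) == y_va).float().mean().item()
    baseline = 0.95 * baseline + 0.05 * reward
    ctrl_loss = -(reward - baseline) * log_prob
    opt_c.zero_grad(); ctrl_loss.backward(); opt_c.step()
```

Because every subgraph reuses the same weight tensors, no candidate architecture is trained from scratch, which is the source of the claimed 1000x cost reduction over standard NAS.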
Results from the Paper
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Neural Architecture Search | NAS-Bench-201, CIFAR-10 | ENAS | Accuracy (Test) | 54.3 | #32 |
| Neural Architecture Search | NAS-Bench-201, CIFAR-10 | ENAS | Accuracy (Val) | 39.77 | #29 |
| Neural Architecture Search | NAS-Bench-201, CIFAR-10 | ENAS | Search time (s) | 13315 | #12 |
| Neural Architecture Search | NAS-Bench-201, CIFAR-100 | ENAS | Accuracy (Test) | 15.61 | #33 |
| Neural Architecture Search | NAS-Bench-201, CIFAR-100 | ENAS | Accuracy (Val) | 15.03 | #31 |
| Neural Architecture Search | NAS-Bench-201, CIFAR-100 | ENAS | Search time (s) | 13315 | #10 |
| Neural Architecture Search | NAS-Bench-201, ImageNet-16-120 | ENAS | Accuracy (Test) | 16.43 | #40 |
| Neural Architecture Search | NAS-Bench-201, ImageNet-16-120 | ENAS | Search time (s) | 13315 | #13 |