An Approximation Algorithm for Optimal Subarchitecture Extraction

16 Oct 2020 · Adrian de Wynter

We consider the problem of finding the set of architectural parameters for a chosen deep neural network that is optimal under three metrics: parameter size, inference speed, and error rate. In this paper we state the problem formally and present an approximation algorithm that, for a large subset of instances, behaves like an FPTAS with an approximation error of $\rho \leq |1 - \epsilon|$ and runs in $O(|\Xi| + |W^*_T|(1 + |\Theta||B||\Xi|/(\epsilon\, s^{3/2})))$ steps, where $\epsilon$ and $s$ are input parameters; $|B|$ is the batch size; $|W^*_T|$ denotes the cardinality of the largest weight set assignment; and $|\Xi|$ and $|\Theta|$ are the cardinalities of the candidate architecture and hyperparameter spaces, respectively.
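
To get a feel for how the stated step bound scales with its inputs, the expression inside the $O(\cdot)$ can be evaluated directly. The following is only an illustrative sketch: the function name `step_bound` and its argument names are hypothetical, the hidden constant factor is ignored, and the code does not implement the approximation algorithm itself.

```python
def step_bound(xi: int, w: int, theta: int, b: int, eps: float, s: float) -> float:
    """Evaluate |Xi| + |W*_T| * (1 + |Theta||B||Xi| / (eps * s^(3/2))).

    Hypothetical helper: it only plugs numbers into the expression from the
    abstract, ignoring the constant hidden by the big-O notation.
      xi    -- |Xi|, number of candidate architectures
      w     -- |W*_T|, cardinality of the largest weight set assignment
      theta -- |Theta|, number of hyperparameter configurations
      b     -- |B|, batch size
      eps,s -- the input parameters epsilon and s (assumed positive here)
    """
    if eps <= 0 or s <= 0:
        raise ValueError("eps and s are assumed to be positive")
    return xi + w * (1 + theta * b * xi / (eps * s ** 1.5))


# Halving eps doubles the dominant term; increasing s shrinks it by s^(3/2).
coarse = step_bound(xi=10, w=100, theta=2, b=4, eps=0.5, s=4)
fine = step_bound(xi=10, w=100, theta=2, b=4, eps=0.25, s=4)
```

With these made-up inputs the bound grows as $\epsilon$ shrinks, which reflects the $1/\epsilon$ dependence in the stated running time.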
