Goten: GPU-Outsourcing Trusted Execution of Neural Network Training and Prediction

25 Sep 2019 · Lucien K.L. Ng, Sherman S.M. Chow, Anna P.Y. Woo, Donald P. H. Wong, Yongjun Zhao ·

Before we saw worldwide collaborative efforts in training machine-learning models or widespread deployments of prediction-as-a-service, we need to devise an efﬁcient privacy-preserving mechanism which guarantees the privacy of all stakeholders (data contributors, model owner, and queriers). Slaom (ICLR ’19) preserves privacy only for prediction by leveraging both trusted environment (e.g., Intel SGX) and untrusted GPU. The challenges for enabling private training are explicitly left open – its pre-computation technique does not hide the model weights and fails to support dynamic quantization corresponding to the large changes in weight magnitudes during training. Moreover, it is not a truly outsourcing solution since (ofﬂine) pre-computation for a job takes as much time as computing the job locally by SGX, i.e., it only works before all pre-computations are exhausted. We propose Goten, a privacy-preserving framework supporting both training and prediction. We tackle all the above challenges by proposing a secure outsourcing protocol which 1) supports dynamic quantization, 2) hides the model weight from GPU, and 3) performs better than a pure-SGX solution even if we perform the precomputation online. Our solution leverages a non-colluding assumption which is often employed by cryptographic solutions aiming for practical efﬁciency (IEEE SP ’13, Usenix Security ’17, PoPETs ’19). We use three servers, which can be reduced to two if the pre-computation is done ofﬂine. Furthermore, we implement our tailor-made memory-aware measures for minimizing the overhead when the SGX memory limit is exceeded (cf., EuroSys ’17, Usenix ATC ’19). Compared to a pure-SGX solution, our experiments show that Goten can speed up linear-layer computations in VGG up to 40×, and overall speed up by 8.64× on VGG11.

PDF Abstract