1 code implementation • 5 Dec 2019 • Matthew LeMay, Shijian Li, Tian Guo
Using Perseus, we evaluated the inference throughput and cost of serving various models and demonstrated that multi-tenant model serving reduced cost by up to 12%.