1 code implementation • 26 Oct 2020 • Ayan Chakrabarti, Roch Guérin, Chenyang Lu, Jiangnan Liu
To deploy machine learning-based algorithms in real-time applications with strict latency constraints, we consider an edge-computing setting in which a subset of inputs is offloaded to the edge for processing by an accurate but resource-intensive model, while the rest are processed only by a less accurate model on the device itself.
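
As a rough illustration of this setting, the sketch below routes each input either to a weak on-device model or to a strong edge model. The confidence-threshold rule and every name in it (`route`, `local_model`, `edge_model`, `threshold`) are assumptions made for illustration only; the abstract does not state how the offloaded subset is chosen, and latency constraints are not modeled here.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a model maps an input to (label, confidence in [0, 1]).
Model = Callable[[float], Tuple[int, float]]


def route(
    inputs: Sequence[float],
    local_model: Model,
    edge_model: Model,
    threshold: float,
) -> List[Tuple[int, str]]:
    """Return (label, where_it_was_decided) for each input.

    Assumed rule (not taken from the source): offload an input to the edge
    when the on-device model's confidence falls below `threshold`.
    """
    results = []
    for x in inputs:
        label, confidence = local_model(x)  # cheap model always runs on the device
        if confidence < threshold:
            label, _ = edge_model(x)  # accurate but resource-intensive model
            results.append((label, "edge"))
        else:
            results.append((label, "device"))
    return results


# Toy stand-ins for the two models, for demonstration only.
def toy_local_model(x: float) -> Tuple[int, float]:
    # Less sure of itself the closer the input is to the 0.5 decision boundary.
    return (1 if x > 0.5 else 0, abs(x - 0.5) * 2)


def toy_edge_model(x: float) -> Tuple[int, float]:
    return (1 if x >= 0.5 else 0, 1.0)


decisions = route([0.1, 0.45, 0.9], toy_local_model, toy_edge_model, threshold=0.5)
```

With these toy models, only the input nearest the decision boundary (0.45) is sent to the edge; the other two are resolved on the device.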