Search Results for author: Yunseong Kim

Found 2 papers, 0 papers with code

PARIS and ELSA: An Elastic Scheduling Algorithm for Reconfigurable Multi-GPU Inference Servers

no code implementations · 27 Feb 2022 · Yunseong Kim, Yujeong Choi, Minsoo Rhu

Maximizing server utilization and system throughput is also crucial for ML service providers, as it helps lower the total cost of ownership.

Scheduling
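The excerpt above points at the core tension in multi-GPU inference serving: packing requests tightly enough to keep utilization high without starving large models of capacity. A minimal sketch of that idea is a greedy best-fit placer over reconfigurable GPU partitions. This is not the PARIS/ELSA algorithm from the paper; the slice-based capacity model and the `GPU` type are hypothetical, chosen only to illustrate why elastic partitioning raises utilization.

```python
# Illustrative sketch only, NOT the paper's scheduler. Assumes a
# hypothetical MIG-style GPU that can be carved into 7 capacity slices.
from dataclasses import dataclass

@dataclass
class GPU:
    capacity: int  # total partition slices on this GPU
    used: int = 0

    def free(self) -> int:
        return self.capacity - self.used

def place(gpus, demand):
    """Best-fit: choose the GPU whose free capacity exceeds `demand`
    by the smallest margin, keeping large slots open for big models."""
    candidates = [g for g in gpus if g.free() >= demand]
    if not candidates:
        return None  # no partition large enough; caller must queue
    best = min(candidates, key=lambda g: g.free() - demand)
    best.used += demand
    return best

gpus = [GPU(capacity=7), GPU(capacity=7)]
for demand in [4, 2, 1, 3]:      # slice demands of arriving model instances
    place(gpus, demand)
print([g.used for g in gpus])    # → [7, 3]
```

Best-fit fills the first GPU completely before touching the second, whereas a first-fit or round-robin placement would fragment both GPUs and reject the 3-slice request.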

LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference

no code implementations · 25 Oct 2020 · Yujeong Choi, Yunseong Kim, Minsoo Rhu

In cloud ML inference systems, batching is an essential technique for increasing throughput, which in turn helps optimize total cost of ownership.

BIG-bench Machine Learning · Scheduling
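Batching trades latency for throughput: waiting for more requests makes each GPU invocation more efficient, but an SLA-aware batcher must flush early before the oldest queued request misses its deadline. The sketch below is a generic deadline-aware batcher, not the paper's LazyBatching system; the arrival times, SLA bound, and fixed per-batch execution time are all hypothetical.

```python
# Illustrative sketch only, NOT the LazyBatching algorithm.
SLA_MS = 100    # assumed end-to-end latency target per request
EXEC_MS = 30    # assumed execution time of one batch
MAX_BATCH = 4   # assumed maximum batch size

def dispatch_batches(arrivals_ms):
    """Group requests into batches, flushing early whenever admitting
    one more request would push the oldest queued request past its SLA."""
    batches, queue = [], []
    for t in arrivals_ms:
        # if batching the request arriving at time t would make the
        # oldest queued request (arrived at queue[0]) miss its deadline,
        # dispatch the current batch first
        if queue and t + EXEC_MS > queue[0] + SLA_MS:
            batches.append(queue)
            queue = []
        queue.append(t)
        if len(queue) == MAX_BATCH:
            batches.append(queue)
            queue = []
    if queue:
        batches.append(queue)
    return batches

print(dispatch_batches([0, 10, 20, 95, 180]))
# → [[0, 10, 20], [95], [180]]
```

The first three requests share one batch because even the last arrival leaves the request from t=0 within its 100 ms budget, while the request at t=95 forces a flush: batching it would complete at 125 ms, past the oldest request's deadline.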
