no code implementations • 19 Dec 2021 • Seo Jin Park, Joshua Fried, Sunghyun Kim, Mohammad Alizadeh, Adam Belay
As emerging deep neural network (DNN) models continue to grow in size, using large GPU clusters to train DNNs is becoming an essential requirement to achieving acceptable training times.
no code implementations • 26 Oct 2017 • Seo Jin Park, John Ousterhout
This strategy allows most operations to complete in 1 RTT (the same as an unreplicated system).
Distributed, Parallel, and Cluster Computing Operating Systems