1 code implementation • 23 Apr 2024 • Kaustubh Shivdikar, Nicolas Bohm Agostini, Malith Jayaweera, Gilbert Jonatan, Jose L. Abellan, Ajay Joshi, John Kim, David Kaeli
We introduce a rolling eviction strategy to mitigate data idling in on-chip memory and to address the prevalent issue of memory bloat in sparse graph computations.
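As a rough illustration of the idea behind rolling eviction (a toy sketch, not the paper's actual hardware design; the class name, the per-entry use counter, and the FIFO fallback are all assumptions for illustration), one can model an on-chip buffer that frees an entry as soon as its last expected use completes, instead of letting stale data idle until capacity pressure forces an eviction:

```python
from collections import OrderedDict

class RollingEvictionBuffer:
    """Toy model of an on-chip buffer with rolling eviction:
    each entry carries a remaining-use count and is evicted the
    moment that count reaches zero, so dead data never idles.
    (Hypothetical sketch, not the paper's design.)"""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> [value, remaining_uses]

    def insert(self, key, value, expected_uses):
        # Under capacity pressure, fall back to evicting the oldest entry.
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)
        self.entries[key] = [value, expected_uses]

    def read(self, key):
        value, remaining = self.entries[key]
        if remaining <= 1:
            del self.entries[key]  # rolling eviction: free the slot immediately
        else:
            self.entries[key][1] = remaining - 1
        return value
```

In this sketch, an entry inserted with `expected_uses=2` survives its first read and is evicted on its second, so buffer slots are recycled as early as the access pattern allows.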
no code implementations • 23 Feb 2023 • Yujeong Choi, John Kim, Minsoo Rhu
While providing low latency is a fundamental requirement in deploying recommendation services, achieving high resource utilization is also crucial to cost-effectively maintaining the datacenter.
no code implementations • 22 Jun 2022 • Ibrahim Ahmed, Sahil Parmar, Matthew Boyd, Michael Beidler, Kris Kang, Bill Liu, Kyle Roach, John Kim, Dennis Abts
Transformers have become a predominant machine learning workload: they are not only the de facto standard for natural language processing tasks, but they are also being deployed in other domains such as vision and speech recognition.
no code implementations • 20 Mar 2020 • Maxim Naumov, John Kim, Dheevatsa Mudigere, Srinivas Sridharan, Xiaodong Wang, Whitney Zhao, Serhat Yilmaz, Changkyu Kim, Hector Yuen, Mustafa Ozdal, Krishnakumar Nair, Isabel Gao, Bor-Yiing Su, Jiyan Yang, Mikhail Smelyanskiy
Large-scale training is important to ensure high performance and accuracy of machine-learning models.
Distributed, Parallel, and Cluster Computing (MSC: 68T05, 68M10; ACM: H.3.3; I.2.6; C.2.1)
no code implementations • 15 Nov 2019 • Bongjoon Hyun, Youngeun Kwon, Yujeong Choi, John Kim, Minsoo Rhu
To satisfy the compute and memory demands of deep neural networks, neural processing units (NPUs) are being widely used to accelerate deep learning algorithms.
1 code implementation • 23 Jul 2019 • Samuel Yu, Heon Lee, John Kim
LYTNet delivers the two most important pieces of information that visually impaired pedestrians need to cross the road.