Search Results for author: John Kim

Found 6 papers, 2 papers with code

NeuraChip: Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator

1 code implementation • 23 Apr 2024 • Kaustubh Shivdikar, Nicolas Bohm Agostini, Malith Jayaweera, Gilbert Jonatan, Jose L. Abellan, Ajay Joshi, John Kim, David Kaeli

We introduce a rolling eviction strategy to mitigate data idling in on-chip memory and to address the prevalent issue of memory bloat in sparse graph computations.
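For illustration only: the sketch below shows the general idea behind a "rolling" eviction in a hash-based accumulator, where an entry is written back as soon as it has received all of its partial updates instead of idling on chip. The class and field names (HashAccumulator, expected_updates) are hypothetical and do not reflect NeuraChip's actual hardware design.

```python
# Hypothetical illustration of a rolling-eviction accumulator for sparse
# products; names and structure are invented for this sketch.

class HashAccumulator:
    def __init__(self, expected_updates):
        # expected_updates[key] = number of partial products that will
        # eventually be accumulated into this output entry.
        self.expected = dict(expected_updates)
        self.table = {}        # on-chip hash table: key -> running sum
        self.seen = {}         # key -> updates received so far
        self.evicted = {}      # results written back ("off-chip") early

    def accumulate(self, key, value):
        self.table[key] = self.table.get(key, 0.0) + value
        self.seen[key] = self.seen.get(key, 0) + 1
        # Rolling eviction: once an entry has received every partial
        # product it will ever get, write it back immediately instead of
        # letting it idle in the on-chip table (avoids memory bloat).
        if self.seen[key] == self.expected[key]:
            self.evicted[key] = self.table.pop(key)


acc = HashAccumulator({("r0", "c3"): 2, ("r0", "c7"): 1})
acc.accumulate(("r0", "c7"), 1.5)   # complete -> evicted right away
acc.accumulate(("r0", "c3"), 0.5)   # still waiting for one more update
acc.accumulate(("r0", "c3"), 0.25)  # complete -> evicted
print(acc.evicted)                  # {('r0', 'c7'): 1.5, ('r0', 'c3'): 0.75}
```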

Hera: A Heterogeneity-Aware Multi-Tenant Inference Server for Personalized Recommendations

no code implementations • 23 Feb 2023 • Yujeong Choi, John Kim, Minsoo Rhu

While providing low latency is a fundamental requirement in deploying recommendation services, achieving high resource utilization is also crucial to operating the datacenter cost-effectively.
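For illustration only: a toy Python scheduler sketching the latency-vs-utilization trade-off described above. The device names, latency numbers, and policy (send a query to the slowest device that still meets its deadline) are invented for this sketch and are not Hera's actual algorithm.

```python
# Toy heterogeneity-aware dispatcher: prefer the slower, cheaper device
# whenever it can still meet the latency SLA, keeping the fast device
# free for tight deadlines and overall utilization high.

DEVICES = {
    "gpu": {"latency_ms": 4.0, "busy_until": 0.0},
    "cpu": {"latency_ms": 12.0, "busy_until": 0.0},
}

def schedule(arrival_ms, sla_ms):
    """Return the device chosen for a request, or None if no device can
    finish it within the SLA."""
    candidates = []
    for name, dev in DEVICES.items():
        start = max(arrival_ms, dev["busy_until"])
        finish = start + dev["latency_ms"]
        if finish - arrival_ms <= sla_ms:
            candidates.append((dev["latency_ms"], name, finish))
    if not candidates:
        return None  # reject or queue the request; SLA cannot be met
    # Pick the slowest device that still meets the deadline.
    _, name, finish = max(candidates)
    DEVICES[name]["busy_until"] = finish
    return name

print(schedule(0.0, sla_ms=20.0))  # 'cpu' is enough to meet the SLA
print(schedule(0.0, sla_ms=5.0))   # falls back to 'gpu'
```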

Answer Fast: Accelerating BERT on the Tensor Streaming Processor

no code implementations • 22 Jun 2022 • Ibrahim Ahmed, Sahil Parmar, Matthew Boyd, Michael Beidler, Kris Kang, Bill Liu, Kyle Roach, John Kim, Dennis Abts

Transformers have become a predominant machine learning workload: they are not only the de facto standard for natural language processing tasks, but they are also being deployed in other domains such as vision and speech recognition.

Machine Translation • Speech Recognition +1

Deep Learning Training in Facebook Data Centers: Design of Scale-up and Scale-out Systems

no code implementations • 20 Mar 2020 • Maxim Naumov, John Kim, Dheevatsa Mudigere, Srinivas Sridharan, Xiaodong Wang, Whitney Zhao, Serhat Yilmaz, Changkyu Kim, Hector Yuen, Mustafa Ozdal, Krishnakumar Nair, Isabel Gao, Bor-Yiing Su, Jiyan Yang, Mikhail Smelyanskiy

Large-scale training is important to ensure high performance and accuracy of machine-learning models.

Distributed, Parallel, and Cluster Computing • MSC: 68T05, 68M10 • ACM: H.3.3; I.2.6; C.2.1

NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units

no code implementations • 15 Nov 2019 • Bongjoon Hyun, Youngeun Kwon, Yujeong Choi, John Kim, Minsoo Rhu

To satisfy the compute and memory demands of deep neural networks, neural processing units (NPUs) are being widely used to accelerate deep learning algorithms.

Management • Translation
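For illustration only: a minimal Python sketch of what address translation means in this context, namely mapping a virtual address to a physical one through a TLB with a page-table fallback. The page size, table contents, and function names are invented and do not reflect the NeuMMU design.

```python
# Minimal virtual-to-physical address translation with a TLB cache.
# Page size and page-table contents are made up for this example.

PAGE_SIZE = 4096

page_table = {0x0: 0x40, 0x1: 0x91, 0x2: 0x7A}  # virtual page -> physical frame
tlb = {}                                         # cache of recent translations

def translate(vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                 # TLB hit: fast path
        frame = tlb[vpn]
    else:                          # TLB miss: walk the page table (slow)
        frame = page_table[vpn]
        tlb[vpn] = frame
    return frame * PAGE_SIZE + offset

print(hex(translate(0x1008)))      # page 0x1 -> frame 0x91, offset 0x8 -> 0x91008
```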
