Search Results for author: Christina Giannoula

Found 4 papers, 1 paper with code

Accelerating Graph Neural Networks on Real Processing-In-Memory Systems

no code implementations • 26 Feb 2024 • Christina Giannoula, Peiming Yang, Ivan Fernandez Vega, Jiacheng Yang, Yu Xin Li, Juan Gomez Luna, Mohammad Sadrosadati, Onur Mutlu, Gennady Pekhimenko

Graph Neural Network (GNN) execution involves both compute-intensive and memory-intensive kernels; the latter dominate the total time and are significantly bottlenecked by data movement between memory and processors.

Minuet: Accelerating 3D Sparse Convolutions on GPUs

1 code implementation • 1 Dec 2023 • Jiacheng Yang, Christina Giannoula, Jun Wu, Mostafa Elhoushi, James Gleeson, Gennady Pekhimenko

Minuet proposes to (i) replace the hash tables used in the Map step with a novel segmented sorting double-traversed binary search algorithm that makes heavy use of the on-chip memory hierarchy of GPUs, (ii) use a lightweight scheme to autotune the tile size in the Gather and Scatter operations of the GMaS step, so as to adapt the execution to the particular characteristics of each SC layer, dataset, and GPU architecture, and (iii) employ a padding-efficient GEMM grouping approach that reduces both memory padding and kernel launch overheads.
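
As a rough illustration of point (i), the sketch below builds a sparse-convolution kernel map with a sorted coordinate array and binary search instead of a hash table. It is a plain CPU version written for this listing, not Minuet's segmented sorting double-traversed algorithm or its GPU implementation; the function name `build_kernel_map` and its arguments are hypothetical.

```python
from bisect import bisect_left


def build_kernel_map(input_coords, output_coords, offsets):
    """Map step sketch: for each weight offset, list (input_index, output_index) pairs.

    Assumption: coordinates and offsets are integer tuples of equal length.
    This uses an ordinary binary search over one sorted array, which only
    approximates the idea described in the abstract.
    """
    # Sort input coordinates once, remembering each coordinate's original index.
    order = sorted(range(len(input_coords)), key=lambda i: input_coords[i])
    sorted_coords = [input_coords[i] for i in order]

    kernel_map = {}
    for offset in offsets:
        pairs = []
        for out_idx, out in enumerate(output_coords):
            # The input voxel that this weight offset would read from.
            query = tuple(c + d for c, d in zip(out, offset))
            pos = bisect_left(sorted_coords, query)
            # A hit means a non-zero input voxel exists at the queried position.
            if pos < len(sorted_coords) and sorted_coords[pos] == query:
                pairs.append((order[pos], out_idx))
        kernel_map[offset] = pairs
    return kernel_map


coords = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
print(build_kernel_map(coords, coords, [(0, 0, 0), (1, 0, 0)]))
```

Each lookup costs O(log n) comparisons over contiguous memory, which is the property that makes a sorted layout friendlier to caches than random hash-table probes.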

The Synergy of Speculative Decoding and Batching in Serving Large Language Models

no code implementations • 28 Oct 2023 • Qidong Su, Christina Giannoula, Gennady Pekhimenko

Large Language Models (LLMs) like GPT are state-of-the-art text generation models that provide significant assistance in daily routines.

Text Generation
