Search Results for author: Jeffrey S. Vetter

Found 4 papers, 0 papers with code

Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation

no code implementations • 12 Sep 2023 • Pedro Valero-Lara, Alexis Huante, Mustafa Al Lail, William F. Godoy, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e. g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e. g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA. jl, AMDGPU. jl).

Paper
Add Code

On-Sensor Data Filtering using Neuromorphic Computing for High Energy Physics Experiments

no code implementations • 20 Jul 2023 • Shruti R. Kulkarni, Aaron Young, Prasanna Date, Narasinga Rao Miniskar, Jeffrey S. Vetter, Farah Fahim, Benjamin Parpillon, Jennet Dickinson, Nhan Tran, Jieun Yoo, Corrinne Mills, Morris Swartz, Petar Maksimovic, Catherine D. Schuman, Alice Bean

We present our insights on the various system design choices - from data encoding to optimal hyperparameters of the training algorithm - for an accurate and compact SNN optimized for hardware deployment.

Paper
Add Code

Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation

no code implementations • 27 Jun 2023 • William F. Godoy, Pedro Valero-Lara, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG.

Paper
Add Code

NVIDIA Tensor Core Programmability, Performance & Precision

no code implementations • 11 Mar 2018 • Stefano Markidis, Steven Wei Der Chien, Erwin Laure, Ivy Bo Peng, Jeffrey S. Vetter

After experimenting with different approaches, we found that NVIDIA Tensor Cores can deliver up to 83 Tflops/s in mixed precision on a Tesla V100 GPU, seven and three times the performance in single and half precision respectively.

Distributed, Parallel, and Cluster Computing Performance

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.