Search Results for author: Jeffrey S. Vetter

Found 4 papers, 0 papers with code

Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation

no code implementations12 Sep 2023 Pedro Valero-Lara, Alexis Huante, Mustafa Al Lail, William F. Godoy, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e. g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e. g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA. jl, AMDGPU. jl).

On-Sensor Data Filtering using Neuromorphic Computing for High Energy Physics Experiments

no code implementations20 Jul 2023 Shruti R. Kulkarni, Aaron Young, Prasanna Date, Narasinga Rao Miniskar, Jeffrey S. Vetter, Farah Fahim, Benjamin Parpillon, Jennet Dickinson, Nhan Tran, Jieun Yoo, Corrinne Mills, Morris Swartz, Petar Maksimovic, Catherine D. Schuman, Alice Bean

We present our insights on the various system design choices - from data encoding to optimal hyperparameters of the training algorithm - for an accurate and compact SNN optimized for hardware deployment.

Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation

no code implementations27 Jun 2023 William F. Godoy, Pedro Valero-Lara, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG.

NVIDIA Tensor Core Programmability, Performance & Precision

no code implementations11 Mar 2018 Stefano Markidis, Steven Wei Der Chien, Erwin Laure, Ivy Bo Peng, Jeffrey S. Vetter

After experimenting with different approaches, we found that NVIDIA Tensor Cores can deliver up to 83 Tflops/s in mixed precision on a Tesla V100 GPU, seven and three times the performance in single and half precision respectively.

Distributed, Parallel, and Cluster Computing Performance

Cannot find the paper you are looking for? You can Submit a new open access paper.