Search Results for author: Simon J. Pennycook

Found 2 papers, 1 paper with code

High-Performance Code Generation through Fusion and Vectorization

1 code implementation • 24 Oct 2017 • Jason Sewall, Simon J. Pennycook

We present a technique for automatically transforming kernel-based computations in disparate, nested loops into a fused, vectorized form that can reduce intermediate storage needs and lead to improved performance on contemporary hardware.

Performance • Distributed, Parallel, and Cluster Computing