2 code implementations • 9 Jan 2024 • Thomas Randall, Jaehoon Koo, Brice Videau, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall, Rong Ge, Prasanna Balaprakash
We introduce the first generative TL-based autotuning approach based on the Gaussian copula (GC) to model the high-performing regions of the search space from prior data and then generate high-performing configurations for new tasks.
1 code implementation • 13 Sep 2023 • Xingfu Wu, Praveen Paramasivam, Valerie Taylor
Apache TVM (Tensor Virtual Machine), an open source machine learning compiler framework designed to optimize computations across various hardware platforms, provides an opportunity to improve the performance of dense matrix factorizations such as LU (Lower Upper) decomposition and Cholesky decomposition on GPUs and AI (Artificial Intelligence) accelerators.
1 code implementation • 28 Mar 2023 • Xingfu Wu, Prasanna Balaprakash, Michael Kruse, Jaehoon Koo, Brice Videau, Paul Hovland, Valerie Taylor, Brad Geltz, Siddhartha Jana, Mary Hall
As we enter the exascale computing era, efficiently utilizing power and optimizing the performance of scientific applications under power and energy constraints has become critical and challenging.
no code implementations • 10 May 2021 • Jaehoon Koo, Prasanna Balaprakash, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall
The search space exposed by the transformation pragmas is a tree, wherein each node represents a specific combination of loop transformations that can be applied to the code resulting from the parent node's loop transformations.
1 code implementation • 27 Apr 2021 • Xingfu Wu, Michael Kruse, Prasanna Balaprakash, Hal Finkel, Paul Hovland, Valerie Taylor, Mary Hall
In this paper, we develop a ytopt autotuning framework that leverages Bayesian optimization to explore the parameter space search and compare four different supervised learning methods within Bayesian optimization and evaluate their effectiveness.
no code implementations • 12 Nov 2020 • Xingfu Wu, Valerie Taylor, Zhiling Lan
In this paper, we use modeling and prediction tool MuMMI (Multiple Metrics Modeling Infrastructure) and ten machine learning methods to model and predict performance and power and compare their prediction error rates.
no code implementations • 12 Nov 2020 • Xingfu Wu, Valerie Taylor
In this paper, we utilize ensemble learning to combine linear, nonlinear, and tree-/rule-based ML methods to cope with the bias-variance tradeoff and result in more accurate models.
no code implementations • 15 Oct 2020 • Xingfu Wu, Michael Kruse, Prasanna Balaprakash, Hal Finkel, Paul Hovland, Valerie Taylor, Mary Hall
An autotuning is an approach that explores a search space of possible implementations/configurations of a kernel or an application by selecting and evaluating a subset of implementations/configurations on a target platform and/or use models to identify a high performance implementation/configuration.
no code implementations • 13 Oct 2020 • Michael Kruse, Hal Finkel, Xingfu Wu
In this paper we propose a loop transformation search space that takes the form of a tree, in contrast to previous approaches that usually use vector spaces to represent loop optimization configurations.