Search Results for author: Michael E. Papka

Found 7 papers, 2 papers with code

Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling

no code implementations • 24 Mar 2024 • Boyang Li, Zhiling Lan, Michael E. Papka

In this work, we present a framework called IRL (Interpretable Reinforcement Learning) to address the issue of interpretability of DRL scheduling.

Imitation Learning reinforcement-learning +1

Paper
Add Code

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

no code implementations • 6 Oct 2023 • Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens

In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.

Paper
Add Code

A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators

no code implementations • 6 Oct 2023 • Murali Emani, Sam Foreman, Varuni Sastry, Zhen Xie, Siddhisanket Raskar, William Arnold, Rajeev Thakur, Venkatram Vishwanath, Michael E. Papka

However, the comparative performance of these AI accelerators on large language models has not been previously studied.

Paper
Add Code

A Multi-Level, Multi-Scale Visual Analytics Approach to Assessment of Multifidelity HPC Systems

no code implementations • 15 Jun 2023 • Shilpika, Bethany Lusch, Murali Emani, Filippo Simini, Venkatram Vishwanath, Michael E. Papka, Kwan-Liu Ma

This end-to-end log analysis system, coupled with visual analytics support, allows users to glean and promptly extract supercomputer usage and error patterns at varying temporal and spatial resolutions.

Paper
Add Code

Distributed Neural Representation for Reactive in situ Visualization

no code implementations • 28 Mar 2023 • Qi Wu, Joseph A. Insley, Victor A. Mateevitsi, Silvio Rizzi, Michael E. Papka, Kwan-Liu Ma

In this work, we develop an implicit neural representation for distributed volume data and incorporate it into the DIVA reactive programming system.

Paper
Add Code

BFTrainer: Low-Cost Training of Neural Networks on Unfillable Supercomputer Nodes

3 code implementations • 22 Jun 2021 • Zhengchun Liu, Rajkumar Kettimuthu, Michael E. Papka, Ian Foster

We describe how the task of rescaling suitable DNN training tasks to fit dynamically changing holes can be formulated as a deterministic mixed integer linear programming (MILP)-based resource allocation algorithm, and show that this MILP problem can be solved efficiently at run time.

Scheduling

Paper
Code

Deep Reinforcement Agent for Scheduling in HPC

1 code implementation • 11 Feb 2021 • Yuping Fan, Zhiling Lan, Taylor Childers, Paul Rich, William Allcock, Michael E. Papka

Existing cluster scheduling heuristics are developed by human experts based on their experience with specific HPC systems and workloads.

Scheduling

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.