Search Results for author: John D. Owens

Found 8 papers, 3 papers with code

The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks

no code implementations • 30 Sep 2023 • Cameron Shinn, Collin McCarthy, Saurav Muralidharan, Muhammad Osama, John D. Owens

We achieve this through a novel analytical model for predicting sparse network performance, and validate the predicted speedup using several real-world computer vision architectures pruned across a range of sparsity patterns and degrees.

Benchmarking

Building a Performance Model for Deep Learning Recommendation Model Training on GPUs

no code implementations • 19 Jan 2022 • Zhongyi Lin, Louis Feng, Ehsan K. Ardestani, Jaewon Lee, John Lundell, Changkyu Kim, Arun Kejariwal, John D. Owens

We show that our general performance model not only achieves low prediction error on DLRM, which has highly customized configurations and is dominated by multiple factors, but also yields comparable accuracy on other compute-bound ML models targeted by most previous methods.

Fast Gunrock Subgraph Matching (GSM) on GPUs

no code implementations • 1 Mar 2020 • Leyuan Wang, John D. Owens

In this paper, we propose a GPU-efficient subgraph isomorphism algorithm using the Gunrock graph analytic framework, GSM (Gunrock Subgraph Matching), to compute graph matching on GPUs.

Distributed, Parallel, and Cluster Computing

Unsupervised Object Segmentation with Explicit Localization Module

no code implementations • 21 Nov 2019 • Weitang Liu, Lifeng Wei, James Sharpnack, John D. Owens

In this paper, we propose a novel architecture that iteratively discovers and segments out the objects of a scene based on the image reconstruction quality.

Image Reconstruction; Object; +2 more

GraphBLAST: A High-Performance Linear Algebra-based Graph Framework on the GPU

1 code implementation • 4 Aug 2019 • Carl Yang, Aydin Buluc, John D. Owens

In this paper, we examine the performance challenges of a linear-algebra-based approach to building graph frameworks and describe new design principles for overcoming these bottlenecks.

Distributed, Parallel, and Cluster Computing; Mathematical Software

VoroCrust: Voronoi Meshing Without Clipping

1 code implementation • 23 Feb 2019 • Ahmed Abdelkader, Chandrajit L. Bajaj, Mohamed S. Ebeida, Ahmed H. Mahmoud, Scott A. Mitchell, John D. Owens, Ahmad A. Rushdi

The VoroCrust algorithm is the first provably-correct algorithm for conforming polyhedral Voronoi meshing for non-convex and non-manifold domains with guarantees on the quality of both surface and volume elements.

Graphics; Computational Geometry; I.3.5

Object Localization with a Weakly Supervised CapsNet

no code implementations • 20 May 2018 • Weitang Liu, Emad Barsoum, John D. Owens

Our model can learn and derive the coordinates of the digits better than its convolution counterpart that lacks a routing-by-agreement algorithm, and can also perform well when testing on the multi-digit moving MNIST and KTH datasets.

Object; Object Localization; +3 more
