no code implementations • 22 Dec 2023 • Nikolaos Louloudakis, Perry Gibson, José Cano, Ajitha Rajan
Converting deep learning models between frameworks is a common step to maximize model compatibility across devices and leverage optimization features that may be exclusively provided in one deep learning framework.
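The kind of operator mapping such a conversion performs can be sketched as follows. This is a minimal illustration only: the layer dictionaries, operator names, and attribute renamings below are hypothetical stand-ins, not the schema of any real framework converter.

```python
# Hypothetical source-to-target operator table: each source op maps to a
# target op name plus a renaming of its attributes.
OP_MAP = {
    "Conv2d":    ("conv2d",   {"kernel_size": "ksize", "stride": "strides"}),
    "ReLU":      ("relu",     {}),
    "MaxPool2d": ("max_pool", {"kernel_size": "ksize"}),
}

def convert_layer(layer):
    """Translate one source-framework layer dict into the target dialect."""
    op, attr_map = OP_MAP[layer["op"]]
    # Rename attributes the target expects under different keys; pass the
    # rest through unchanged.
    attrs = {attr_map.get(k, k): v for k, v in layer["attrs"].items()}
    return {"op": op, "attrs": attrs}

def convert_model(layers):
    return [convert_layer(l) for l in layers]
```

Real converters additionally handle mismatched operator semantics, data layouts (NCHW vs NHWC), and unsupported ops, which is where conversion faults typically arise.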
no code implementations • 15 Nov 2023 • Perry Gibson, José Cano, Elliot J. Crowley, Amos Storkey, Michael O'Boyle
Deep Neural Networks (DNNs) are extremely computationally demanding, which presents a large barrier to their deployment on resource-constrained devices.
no code implementations • 10 Jun 2023 • Nikolaos Louloudakis, Perry Gibson, José Cano, Ajitha Rajan
To mitigate such errors, we present a novel approach towards fault localization and repair of buggy deep learning framework conversions, focusing on pre-trained image recognition models.
1 code implementation • 5 Jun 2023 • Nikolaos Louloudakis, Perry Gibson, José Cano, Ajitha Rajan
Owing to the increased use of image recognition tasks in safety-critical applications like autonomous driving and medical imaging, it is imperative to assess the robustness of such models to changes in the computational environment. The impact of parameters like deep learning frameworks, compiler optimizations, and hardware devices on model performance and correctness is not yet well understood.
1 code implementation • 2 Jun 2023 • Nikolaos Louloudakis, Perry Gibson, José Cano, Ajitha Rajan
AI methods such as Deep Neural Networks (DNNs) are utilized to perform demanding, resource-intensive, and even safety-critical tasks. To effectively increase the performance of deployed DNN models, a variety of Machine Learning (ML) compilers have been developed, enabling DNNs to run on a range of hardware acceleration devices such as GPUs and TPUs.
no code implementations • 1 Nov 2022 • Nikolaos Louloudakis, Perry Gibson, José Cano, Ajitha Rajan
On the other hand, model inference time was affected by all environment parameters with changes in hardware device having the most effect.
no code implementations • 19 Jun 2022 • Perry Gibson, José Cano
As Deep Neural Networks (DNNs) have become an increasingly ubiquitous workload, the range of libraries and tooling available to aid in their development and deployment has grown significantly.
1 code implementation • 26 Apr 2022 • Axel Stjerngren, Perry Gibson, José Cano
Reconfigurable accelerators for deep neural networks (DNNs) promise to improve performance metrics such as inference latency.
1 code implementation • 14 Jan 2022 • Perry Gibson, José Cano
Auto-scheduling for tensor programs is a process where a search algorithm automatically explores candidate schedules (program transformations) for a given program on a target hardware platform to improve its performance.
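The search loop at the heart of auto-scheduling can be sketched as random search over a small schedule space. This is a toy illustration under stated assumptions: `run_time` is a hypothetical stand-in cost function, whereas a real auto-scheduler compiles each candidate and measures it on the target hardware, usually guided by a learned cost model rather than pure random sampling.

```python
import random

def run_time(schedule):
    # Hypothetical analytic cost function standing in for an on-device
    # measurement; lower is better.
    tile, unroll = schedule
    return abs(tile - 32) * 0.1 + abs(unroll - 4) * 0.2 + 1.0

def random_search(n_trials=200, seed=0):
    """Explore candidate schedules (tile size, unroll factor) at random
    and keep the fastest one found."""
    rng = random.Random(seed)
    best_schedule, best_time = None, float("inf")
    for _ in range(n_trials):
        candidate = (rng.choice([1, 2, 4, 8, 16, 32, 64]),   # tile size
                     rng.choice([1, 2, 4, 8]))               # unroll factor
        t = run_time(candidate)
        if t < best_time:
            best_schedule, best_time = candidate, t
    return best_schedule, best_time
```

Production systems such as TVM's auto-scheduler replace the random sampler with evolutionary search plus a learned cost model, since on-device measurements are expensive.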
1 code implementation • 1 Oct 2021 • Jude Haris, Perry Gibson, José Cano, Nicolas Bohm Agostini, David Kaeli
In this paper we propose SECDA, a new hardware/software co-design methodology to reduce design time of optimized DNN inference accelerators on edge devices with FPGAs.
no code implementations • 24 Jul 2020 • Perry Gibson, José Cano
Optimising deep learning inference across edge devices and optimisation targets such as inference time, memory footprint and power consumption is a key challenge due to the ubiquity of neural networks.
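Trading off several optimisation targets at once is naturally framed as multi-objective selection. The sketch below shows one minimal ingredient, Pareto-front filtering over (time, memory, power) tuples; the configuration names and numbers are illustrative, and real design-space exploration frameworks combine this with far richer search strategies.

```python
def pareto_front(configs):
    """Keep only configurations not dominated on every objective.
    Each config is (name, (time_ms, memory_mb, power_w)); lower is better."""
    def dominates(a, b):
        # a dominates b if it is no worse everywhere and strictly better somewhere.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))
    return [c for c in configs
            if not any(dominates(o[1], c[1]) for o in configs if o is not c)]
```

A configuration that is slower, larger, and hungrier than some alternative is discarded; everything that wins on at least one objective survives for the deployment decision.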
1 code implementation • 17 Jun 2020 • Perry Gibson, José Cano, Jack Turner, Elliot J. Crowley, Michael O'Boyle, Amos Storkey
We observe that our new implementation scales well with the number of groups and provides the best inference times in all settings, improving the existing implementations of grouped convolutions in TVM, PyTorch and TensorFlow Lite by 3.4x, 8x and 4x on average, respectively.
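The semantics of a grouped convolution can be sketched with plain loops: the input channels are split into `groups` partitions, and each output channel only reads from its own partition. This 1-D, stride-1, no-padding version is for illustration only; the optimized kernels the paper compares fuse and vectorize these loops rather than executing them naively.

```python
def grouped_conv1d(x, w, groups):
    """1-D grouped convolution (no padding, stride 1).
    x: [C_in][L] input; w: [C_out][C_in // groups][K] weights."""
    c_in, length = len(x), len(x[0])
    c_out, k = len(w), len(w[0][0])
    cin_g, cout_g = c_in // groups, c_out // groups
    out = [[0.0] * (length - k + 1) for _ in range(c_out)]
    for g in range(groups):
        # Output channels in group g see only input channels in group g.
        for oc in range(g * cout_g, (g + 1) * cout_g):
            for ic_local in range(cin_g):
                ic = g * cin_g + ic_local
                for pos in range(length - k + 1):
                    for t in range(k):
                        out[oc][pos] += x[ic][pos + t] * w[oc][ic_local][t]
    return out
```

With `groups == C_in == C_out` this reduces to a depthwise convolution; with `groups == 1` it is a standard dense convolution. The weight tensor shrinks by a factor of `groups`, which is why grouped convolutions are attractive on constrained devices.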