Search Results for author: George A. Constantinides

Found 14 papers, 7 papers with code

NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions

no code implementations • 29 Feb 2024 • Marta Andronic, George A. Constantinides

In these works, the boundaries of the neurons coincide with the boundaries of the LUTs.

Quantization
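
A minimal sketch of the underlying idea of LUT-based neuron inference referred to above: a neuron with a small number of binary inputs and a binary output can be exhaustively enumerated into a truth table and mapped onto a single FPGA LUT. This is illustrative only and is not the NeuraLUT toolflow; the function name `neuron_to_lut` and the example weights are assumptions.

```python
import itertools
import numpy as np

def neuron_to_lut(weights, threshold):
    """Enumerate a binary-input, binary-output neuron as a truth table.

    A K-input neuron whose inputs and output are binary can be stored as a
    2^K-entry lookup table, i.e. a single K-input FPGA LUT.
    """
    k = len(weights)
    table = {}
    for bits in itertools.product([0, 1], repeat=k):
        activation = np.dot(weights, bits)
        table[bits] = int(activation >= threshold)
    return table

# Example: a 4-input neuron becomes a 16-entry truth table.
lut = neuron_to_lut(weights=[0.7, -1.2, 0.5, 0.9], threshold=0.5)
print(len(lut), lut[(1, 0, 1, 1)])
```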

Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?

1 code implementation • 8 Oct 2023 • Cheng Zhang, Jianyi Cheng, Ilia Shumailov, George A. Constantinides, Yiren Zhao

In this work, we explore the statistical and learning properties of the LLM layer and attribute the bottleneck of LLM quantisation to numerical scaling offsets.

Attribute
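
A generic sketch of block-based quantisation as named in the title: each block of values shares one scaling factor, so an outlier only degrades the precision of its own block. This illustrates the general scheme under assumed block size and bit-width, not the paper's specific arithmetic formats.

```python
import numpy as np

def block_quantise(x, block_size=16, n_bits=4):
    """Quantise a 1-D tensor in blocks that each share one scaling factor."""
    q_max = 2 ** (n_bits - 1) - 1
    x = x.reshape(-1, block_size)
    scale = np.abs(x).max(axis=1, keepdims=True) / q_max
    scale[scale == 0] = 1.0                      # avoid division by zero
    q = np.clip(np.round(x / scale), -q_max - 1, q_max)
    return (q * scale).reshape(-1)               # dequantised values

x = np.random.randn(64).astype(np.float32)
x[3] = 40.0                                      # a scaling-offset outlier
print(np.abs(x - block_quantise(x)).max())       # error stays local to one block
```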

PolyLUT: Learning Piecewise Polynomials for Ultra-Low Latency FPGA LUT-based Inference

1 code implementation • 5 Sep 2023 • Marta Andronic, George A. Constantinides

We show that by using polynomial building blocks, we can achieve the same accuracy using considerably fewer layers of soft logic than by using linear functions, leading to significant latency and area improvements.

Handwritten Digit Recognition • Network Intrusion Detection
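
A hedged sketch of the polynomial-building-block idea described above: each neuron applies a learned multivariate polynomial over a small fan-in instead of a linear function, so fewer layers are needed while the function of each neuron still fits in a truth table. The class name `PolyNeuron` and its parameters are illustrative assumptions, not the PolyLUT implementation.

```python
import itertools
import torch
import torch.nn as nn

class PolyNeuron(nn.Module):
    """Neuron computing a learned multivariate polynomial of a small fan-in."""
    def __init__(self, fan_in=4, degree=2):
        super().__init__()
        # all monomials of degree 1..degree (with repetition) over the inputs
        self.monomials = [
            c for d in range(1, degree + 1)
            for c in itertools.combinations_with_replacement(range(fan_in), d)
        ]
        self.coeffs = nn.Parameter(torch.randn(len(self.monomials) + 1) * 0.1)

    def forward(self, x):                         # x: (batch, fan_in)
        feats = [torch.ones(x.shape[0], device=x.device)]  # constant term
        for idx in self.monomials:
            m = torch.ones(x.shape[0], device=x.device)
            for i in idx:
                m = m * x[:, i]
            feats.append(m)
        return torch.stack(feats, dim=1) @ self.coeffs

poly = PolyNeuron(fan_in=4, degree=2)
print(poly(torch.randn(8, 4)).shape)              # torch.Size([8])
```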

FPGA Resource-aware Structured Pruning for Real-Time Neural Networks

no code implementations • 9 Aug 2023 • Benjamin Ramhorst, Vladimir Loncar, George A. Constantinides

Neural networks achieve state-of-the-art performance in image classification, speech recognition, scientific analysis and many more application areas.

Classification • Image Classification +4

ATHEENA: A Toolflow for Hardware Early-Exit Network Automation

no code implementations • 17 Apr 2023 • Benjamin Biggs, Christos-Savvas Bouganis, George A. Constantinides

Additionally, the toolflow can match the throughput of the same baseline while using as little as 46% of the resources the baseline requires.

Quantization
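
A short sketch of the software-level behaviour of an early-exit network of the kind such a toolflow maps to hardware: a sample leaves at the first exit when its softmax confidence clears a threshold, and only harder samples reach the later, more expensive stage. This is a generic illustration under assumed layer sizes and threshold, not the ATHEENA toolflow.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitNet(nn.Module):
    """Two-stage network with a confidence-gated early exit after stage one."""
    def __init__(self, in_dim=64, n_classes=10, threshold=0.9):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU())
        self.exit1 = nn.Linear(32, n_classes)
        self.stage2 = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                                    nn.Linear(64, n_classes))
        self.threshold = threshold

    def forward(self, x):
        h = self.stage1(x)
        logits1 = self.exit1(h)
        conf, _ = F.softmax(logits1, dim=-1).max(dim=-1)
        take_exit = conf >= self.threshold
        # both stages are evaluated here for batch simplicity; hardware (or a
        # per-sample software path) would skip stage2 for exited samples
        logits2 = self.stage2(h)
        return torch.where(take_exit.unsqueeze(-1), logits1, logits2)

net = EarlyExitNet()
print(net(torch.randn(4, 64)).shape)              # torch.Size([4, 10])
```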

Abstract Interpretation on E-Graphs

1 code implementation • 17 Mar 2022 • Samuel Coward, George A. Constantinides, Theo Drane

Recent e-graph applications have typically considered concrete semantics of expressions, where the notion of equivalence stems from concrete interpretation of expressions.
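
A minimal sketch contrasting concrete and abstract semantics, using interval abstract interpretation over a plain expression tree: concrete evaluation returns one value, while abstract evaluation returns an interval covering every concrete result. The paper's contribution is running this kind of analysis over e-graph e-classes, which this sketch deliberately does not implement; the expression encoding is an assumption.

```python
# Interval abstract interpretation of a tiny expression language.

def interval_add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def interval_mul(a, b):
    prods = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (min(prods), max(prods))

def abstract_eval(expr, env):
    """expr is a nested tuple like ('add', ('var', 'x'), ('const', 3))."""
    op = expr[0]
    if op == 'const':
        return (expr[1], expr[1])
    if op == 'var':
        return env[expr[1]]
    lhs, rhs = abstract_eval(expr[1], env), abstract_eval(expr[2], env)
    return interval_add(lhs, rhs) if op == 'add' else interval_mul(lhs, rhs)

# x in [-1, 2]: x*x + 3 is proven to lie within [1, 7] (sound, not tight).
e = ('add', ('mul', ('var', 'x'), ('var', 'x')), ('const', 3))
print(abstract_eval(e, {'x': (-1, 2)}))
```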

Logic Shrinkage: Learned FPGA Netlist Sparsity for Efficient Neural Network Inference

1 code implementation • 4 Dec 2021 • Erwei Wang, James J. Davis, Georgios-Ilias Stavrou, Peter Y. K. Cheung, George A. Constantinides, Mohamed S. Abdelfattah

To address these issues, we propose logic shrinkage, a fine-grained netlist pruning methodology enabling K to be automatically learned for every LUT in a neural network targeted for FPGA inference.

Efficient Neural Network
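
A generic sketch of learning a per-LUT fan-in K through fine-grained pruning: each output keeps only the inputs whose mask survives a saliency threshold, so different neurons end up with different K. The class `PrunableFanIn` and its training-free mask initialisation are assumptions for illustration and differ from the paper's methodology.

```python
import torch
import torch.nn as nn

class PrunableFanIn(nn.Module):
    """Layer whose per-neuron fan-in K is learned by pruning input masks."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        # initialised randomly here only so the demo yields varied fan-ins;
        # in practice the mask would be trained/regularised towards sparsity
        self.mask = nn.Parameter(torch.rand(out_features, in_features))

    def forward(self, x):
        return x @ (self.weight * self.mask).t()

    def shrink(self, keep_threshold=0.5):
        """Fix the mask to 0/1, dropping low-saliency inputs per neuron."""
        with torch.no_grad():
            self.mask.copy_((self.mask.abs() >= keep_threshold).float())
        return self.mask.sum(dim=1)               # learned K for each neuron

layer = PrunableFanIn(16, 4)
print(layer.shrink())                              # per-neuron fan-in counts
```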

Enabling Binary Neural Network Training on the Edge

2 code implementations • 8 Feb 2021 • Erwei Wang, James J. Davis, Daniele Moro, Piotr Zielinski, Jia Jie Lim, Claudionor Coelho, Satrajit Chatterjee, Peter Y. K. Cheung, George A. Constantinides

The ever-growing computational demands of increasingly complex machine learning models frequently necessitate the use of powerful cloud-based infrastructure for their training.

Quantization

Horizon-independent Preconditioner Design for Linear Predictive Control

no code implementations • 16 Oct 2020 • Ian McInerney, Eric C. Kerrigan, George A. Constantinides

To reduce the number of iterations required, we present a simple method for computing a horizon-independent preconditioning matrix for the Hessian of the condensed problem.

Optimization and Control • Systems and Control
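
A purely illustrative numpy sketch of what preconditioning the condensed Hessian means: a preconditioner P that lowers the condition number of P^{-1/2} H P^{-1/2} reduces the iteration count of first-order QP solvers. The Jacobi (diagonal) scaling below is only a placeholder; the paper's horizon-independent construction is different and is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 30))
H = A @ A.T + 1e-3 * np.eye(30)                  # stand-in for a condensed Hessian

P_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(H)))  # Jacobi preconditioner (example only)
H_pre = P_inv_sqrt @ H @ P_inv_sqrt              # symmetrically preconditioned Hessian

print("cond(H)               =", np.linalg.cond(H))
print("cond(P^-1/2 H P^-1/2) =", np.linalg.cond(H_pre))
```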

LUTNet: Learning FPGA Configurations for Highly Efficient Neural Network Inference

2 code implementations • 24 Oct 2019 • Erwei Wang, James J. Davis, Peter Y. K. Cheung, George A. Constantinides

Research has shown that deep neural networks contain significant redundancy, and thus that high classification accuracy can be achieved even when weights and activations are quantized down to binary values.

Binarization • Efficient Neural Network
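
A minimal sketch of the binary quantisation the abstract refers to, using sign binarisation with a straight-through estimator so gradients can still flow during training. This shows the standard binarisation technique, not LUTNet's LUT-based inference nodes; the class name `BinariseSTE` is an assumption.

```python
import torch

class BinariseSTE(torch.autograd.Function):
    """Sign binarisation with a straight-through estimator in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # pass gradients through only where |x| <= 1 (clipped identity)
        return grad_out * (x.abs() <= 1).float()

x = torch.randn(5, requires_grad=True)
y = BinariseSTE.apply(x)                           # values in {-1, +1}
y.sum().backward()
print(y, x.grad)
```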

Rethinking Arithmetic for Deep Neural Networks

no code implementations • 7 May 2019 • George A. Constantinides

In general, our results suggest that it is valuable to consider Boolean circuits as neural networks, leading to the question of which circuit topologies are promising.

LUTNet: Rethinking Inference in FPGA Soft Logic

2 code implementations • 1 Apr 2019 • Erwei Wang, James J. Davis, Peter Y. K. Cheung, George A. Constantinides

Research has shown that deep neural networks contain significant redundancy, and that high classification accuracies can be achieved even when weights and activations are quantised down to binary values.
