Search Results for author: Luca Benini

Found 153 papers, 57 papers with code

Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems

1 code implementation • 17 Apr 2024 • Luca Bompani, Manuele Rusci, Daniele Palossi, Francesco Conti, Luca Benini

This paper introduces Multi-Resolution Rescored Byte-Track (MR2-ByteTrack), a novel video object detection framework for ultra-low-power embedded processors.

Foundation Models for Structural Health Monitoring

1 code implementation • 3 Apr 2024 • Luca Benfenati, Daniele Jahier Pagliari, Luca Zanatta, Yhorman Alexander Bedoya Velez, Andrea Acquaviva, Massimo Poncino, Enrico Macii, Luca Benini, Alessio Burrello

For AD, we achieve a near-perfect 99.9% accuracy with a monitoring time span of just 15 windows.

Anomaly Detection Knowledge Distillation

Optimizing the Deployment of Tiny Transformers on Low-Power MCUs

1 code implementation • 3 Apr 2024 • Victor J. B. Jung, Alessio Burrello, Moritz Scherer, Francesco Conti, Luca Benini

Moreover, we show that our MHSA depth-first tiling scheme reduces the memory peak by up to 6.19x, while the fused-weight attention can reduce the runtime by 1.53x, and the number of parameters by 25%.

Hand Gesture Recognition Hand-Gesture Recognition

12 mJ per Class On-Device Online Few-Shot Class-Incremental Learning

1 code implementation • 12 Mar 2024 • Yoga Esa Wibowo, Cristian Cioflan, Thorir Mar Ingolfsson, Michael Hersche, Leo Zhao, Abbas Rahimi, Luca Benini

In this work, we introduce Online Few-Shot Class-Incremental Learning (O-FSCIL), based on a lightweight model consisting of a pretrained and metalearned feature extractor and an expandable explicit memory storing the class prototypes.

Few-Shot Class-Incremental Learning Incremental Learning

Boosting keyword spotting through on-device learnable user speech characteristics

no code implementations • 12 Mar 2024 • Cristian Cioflan, Lukas Cavigelli, Luca Benini

Keyword spotting systems for always-on TinyML-constrained applications require on-site tuning to boost the accuracy of offline trained classifiers when deployed in unseen inference conditions.

Few-Shot Learning Keyword Spotting

On-Device Domain Learning for Keyword Spotting on Low-Power Extreme Edge Embedded Systems

no code implementations • 12 Mar 2024 • Cristian Cioflan, Lukas Cavigelli, Manuele Rusci, Miguel de Prado, Luca Benini

Keyword spotting accuracy degrades when neural networks are exposed to noisy environments.

Domain Adaptation Keyword Spotting

SzCORE: A Seizure Community Open-source Research Evaluation framework for the validation of EEG-based automated seizure detection algorithms

3 code implementations • 20 Feb 2024 • Jonathan Dan, Una Pale, Alireza Amirshahi, William Cappelletti, Thorir Mar Ingolfsson, Xiaying Wang, Andrea Cossettini, Adriano Bernini, Luca Benini, Sándor Beniczky, David Atienza, Philippe Ryvlin

Based on existing guidelines and recommendations, the framework introduces a set of recommendations and standards related to datasets, file formats, EEG data input content, seizure annotation input and output, cross-validation strategies, and performance metrics.

EEG Seizure Detection

A Noisy Beat is Worth 16 Words: a Tiny Transformer for Low-Power Arrhythmia Classification on Microcontrollers

no code implementations • 16 Feb 2024 • Paola Busia, Matteo Antonio Scrugli, Victor Jean-Baptiste Jung, Luca Benini, Paolo Meloni

Wearable systems for the long-term monitoring of cardiovascular diseases are becoming widespread and valuable assets in diagnosis and therapy.

A Precision-Optimized Fixed-Point Near-Memory Digital Processing Unit for Analog In-Memory Computing

no code implementations • 12 Feb 2024 • Elena Ferro, Athanasios Vasilopoulos, Corey Lammie, Manuel Le Gallo, Luca Benini, Irem Boybat, Abu Sebastian

Analog In-Memory Computing (AIMC) is an emerging technology for fast and energy-efficient Deep Learning (DL) inference.

Zero-shot Classification using Hyperdimensional Computing

no code implementations • 30 Jan 2024 • Samuele Ruffino, Geethan Karunaratne, Michael Hersche, Luca Benini, Abu Sebastian, Abbas Rahimi

Classification based on Zero-shot Learning (ZSL) is the ability of a model to classify inputs into novel classes on which the model has not previously seen any training examples.

Ranked #3 on Zero-Shot Learning on CUB-200-2011

Attribute Attribute Extraction +2

A Heterogeneous RISC-V based SoC for Secure Nano-UAV Navigation

no code implementations • 7 Jan 2024 • Luca Valente, Alessandro Nadalini, Asif Veeran, Mattia Sinigaglia, Bruno Sa, Nils Wistoff, Yvan Tortorella, Simone Benatti, Rafail Psiakis, Ari Kulmala, Baker Mohammad, Sandro Pinto, Daniele Palossi, Luca Benini, Davide Rossi

To the best of the authors' knowledge, it is the first silicon prototype of a ULP SoC coupling the RV64 and RV32 cores in a heterogeneous host+accelerator architecture fully based on the RISC-V ISA.

TCNCA: Temporal Convolution Network with Chunked Attention for Scalable Sequence Processing

no code implementations • 9 Dec 2023 • Aleksandar Terzic, Michael Hersche, Geethan Karunaratne, Luca Benini, Abu Sebastian, Abbas Rahimi

We build upon their approach by replacing the linear recurrence with a special temporal convolutional network which permits larger receptive field size with shallower networks, and reduces the computational complexity to $O(L)$.

Language Modelling

MIMONets: Multiple-Input-Multiple-Output Neural Networks Exploiting Computation in Superposition

1 code implementation • NeurIPS 2023 • Nicolas Menet, Michael Hersche, Geethan Karunaratne, Luca Benini, Abu Sebastian, Abbas Rahimi

MIMONets augment various deep neural network architectures with variable binding mechanisms to represent an arbitrary number of inputs in a compositional data structure via fixed-width distributed representations.

A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures

no code implementations • 29 Nov 2023 • Fabrizio Ferrandi, Serena Curzel, Leandro Fiorin, Daniele Ielmini, Cristina Silvano, Francesco Conti, Alessio Burrello, Francesco Barchi, Luca Benini, Luciano Lavagno, Teodoro Urso, Enrico Calore, Sebastiano Fabio Schifano, Cristian Zambelli, Maurizio Palesi, Giuseppe Ascia, Enrico Russo, Nicola Petra, Davide De Caro, Gennaro Di Meo, Valeria Cardellini, Salvatore Filippone, Francesco Lo Presti, Francesco Silvestri, Paolo Palazzari, Stefania Perri

This survey provides a holistic review of the most influential design methodologies and EDA tools proposed in recent years to implement Deep Learning accelerators, offering the reader a wide perspective in this rapidly evolving field.

Stella Nera: Achieving 161 TOp/s/W with Multiplier-free DNN Acceleration based on Approximate Matrix Multiplication

1 code implementation • 16 Nov 2023 • Jannis Schönleber, Lukas Cavigelli, Renzo Andri, Matteo Perotti, Luca Benini

From classical HPC to deep learning, MatMul is at the heart of today's computing.

Quantization

Quantitative Evaluation of a Multi-Modal Camera Setup for Fusing Event Data with RGB Images

no code implementations • 3 Nov 2023 • Julian Moosmann, Jakub Mandula, Philipp Mayer, Luca Benini, Michele Magno

This work quantitatively evaluates a multi-modal camera setup for fusing high-resolution DVS data with RGB image data by static camera alignment.

Autonomous Driving object-detection +1

Ultra-Efficient On-Device Object Detection on AI-Integrated Smart Glasses with TinyissimoYOLO

no code implementations • 2 Nov 2023 • Julian Moosmann, Pietro Bonazzi, Yawei Li, Sizhen Bian, Philipp Mayer, Luca Benini, Michele Magno

To this goal, we designed a smart glasses prototype as a research platform featuring two microcontrollers, including a novel milliwatt-power RISC-V parallel processor with a hardware accelerator for visual AI, and a Bluetooth low-power module for communication.

Ranked #1 on Object Detection on PASCAL VOC

Benchmarking Edge-computing +3

Enhancing Neural Architecture Search with Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices

1 code implementation • 11 Oct 2023 • Alessio Burrello, Matteo Risso, Beatrice Alessandra Motetti, Enrico Macii, Luca Benini, Daniele Jahier Pagliari

The rapid proliferation of computing domains relying on Internet of Things (IoT) devices has created a pressing need for efficient and accurate deep-learning (DL) models that can run on low-power devices.

Neural Architecture Search

Skilog: A Smart Sensor System for Performance Analysis and Biofeedback in Ski Jumping

no code implementations • 25 Sep 2023 • Lukas Schulthess, Thorir Mar Ingolfsson, Marc Nölke, Michele Magno, Luca Benini, Christoph Leitner

In particular, a fine-grained control of the center of gravity in the in-run is essential.

Enhancing Performance, Calibration Time and Efficiency in Brain-Machine Interfaces through Transfer Learning and Wearable EEG Technology

no code implementations • 14 Sep 2023 • Xiaying Wang, Lan Mei, Victor Kartsch, Andrea Cossettini, Luca Benini

The comfortable BMI setup with tiny CNN and TL paves the way to future on-device continual learning, essential for tackling inter-session variability and improving usability.

Continual Learning EEG +1

A Wearable Ultra-Low-Power sEMG-Triggered Ultrasound System for Long-Term Muscle Activity Monitoring

no code implementations • 13 Sep 2023 • Sebastian Frey, Victor Kartsch, Christoph Leitner, Andrea Cossettini, Sergei Vostrikov, Simone Benatti, Luca Benini

Assuming a muscle contraction of 200 ms at a contraction rate of 1 Hz, the proposed approach enables more than 59% energy saving (with a full-system average power consumption of 12.2 mW) as compared to operating both sEMG and US continuously.

EpiDeNet: An Energy-Efficient Approach to Seizure Detection for Embedded Systems

no code implementations • 28 Aug 2023 • Thorir Mar Ingolfsson, Upasana Chakraborty, Xiaying Wang, Sándor Beniczky, Pauline Ducouret, Simone Benatti, Philippe Ryvlin, Andrea Cossettini, Luca Benini

The EpiDeNet-SSWCE method demonstrates effective and accurate seizure detection performance on heavily imbalanced datasets, while being suited for implementation on energy-constrained platforms.

EEG Seizure Detection +1

Flexible and Fully Quantized Ultra-Lightweight TinyissimoYOLO for Ultra-Low-Power Edge Systems

no code implementations • 12 Jul 2023 • Julian Moosmann, Hanna Mueller, Nicky Zimmerman, Georg Rutishauser, Luca Benini, Michele Magno

With this paper, we demonstrate the suitability and flexibility of TinyissimoYOLO on state-of-the-art detection datasets for real-time ultra-low-power edge inference.

object-detection Object Detection

ITA: An Energy-Efficient Attention and Softmax Accelerator for Quantized Transformers

no code implementations • 7 Jul 2023 • Gamze İslamoğlu, Moritz Scherer, Gianna Paulin, Tim Fischer, Victor J. B. Jung, Angelo Garofalo, Luca Benini

Transformer networks have emerged as the state-of-the-art approach for natural language processing tasks and are gaining popularity in other domains such as computer vision and audio processing.

Quantization

Free Bits: Latency Optimization of Mixed-Precision Quantized Neural Networks on the Edge

no code implementations • 6 Jul 2023 • Georg Rutishauser, Francesco Conti, Luca Benini

Mixed-precision quantization, where a deep neural network's layers are quantized to different precisions, offers the opportunity to optimize the trade-offs between model size, latency, and statistical accuracy beyond what can be achieved with homogeneous-bit-width quantization.

Navigate Quantization

BioGAP: a 10-Core FP-capable Ultra-Low Power IoT Processor, with Medical-Grade AFE and BLE Connectivity for Wearable Biosignal Processing

no code implementations • 4 Jul 2023 • Sebastian Frey, Marco Guermandi, Simone Benatti, Victor Kartsch, Andrea Cossettini, Luca Benini

Wearable biosignal processing applications are driving significant progress toward miniaturized, energy-efficient Internet-of-Things solutions for both clinical and consumer applications.

SSVEP

A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms

no code implementations • 27 Jun 2023 • Cristina Silvano, Daniele Ielmini, Fabrizio Ferrandi, Leandro Fiorin, Serena Curzel, Luca Benini, Francesco Conti, Angelo Garofalo, Cristian Zambelli, Enrico Calore, Sebastiano Fabio Schifano, Maurizio Palesi, Giuseppe Ascia, Davide Patti, Stefania Perri, Nicola Petra, Davide De Caro, Luciano Lavagno, Teodoro Urso, Valeria Cardellini, Gian Carlo Cardarilli, Robert Birke

Recent trends in deep learning (DL) imposed hardware accelerators as the most viable solution for several classes of high-performance computing (HPC) applications such as image classification, computer vision, and speech recognition.

Image Classification speech-recognition +1

Energy-efficient Wearable-to-Mobile Offload of ML Inference for PPG-based Heart-Rate Estimation

no code implementations • 8 Jun 2023 • Alessio Burrello, Matteo Risso, Noemi Tomasello, Yukai Chen, Luca Benini, Enrico Macii, Massimo Poncino, Daniele Jahier Pagliari

In this work, we propose a collaborative inference approach that uses both a smartwatch and a connected smartphone to maximize the performance of heart rate (HR) tracking while also maximizing the smartwatch's battery life.

Collaborative Inference Heart rate estimation

Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference

1 code implementation • 8 Jun 2023 • Matteo Risso, Alessio Burrello, Giuseppe Maria Sarda, Luca Benini, Enrico Macii, Massimo Poncino, Marian Verhelst, Daniele Jahier Pagliari

The need to execute Deep Neural Networks (DNNs) at low latency and low power at the edge has spurred the development of new heterogeneous Systems-on-Chips (SoCs) encapsulating a diverse set of hardware accelerators.

Quantization

Reduced Precision Floating-Point Optimization for Deep Neural Network On-Device Learning on MicroControllers

1 code implementation • 30 May 2023 • Davide Nadalini, Manuele Rusci, Luca Benini, Francesco Conti

Enabling On-Device Learning (ODL) for Ultra-Low-Power Micro-Controller Units (MCUs) is a key step for post-deployment adaptation and fine-tuning of Deep Neural Network (DNN) models in future TinyML applications.

Continual Learning Image Classification +1

ColibriUAV: An Ultra-Fast, Energy-Efficient Neuromorphic Edge Processing UAV-Platform with Event-Based and Frame-Based Cameras

no code implementations • 27 May 2023 • Sizhen Bian, Lukas Schulthess, Georg Rutishauser, Alfio Di Mauro, Luca Benini, Michele Magno

The interest in dynamic vision sensor (DVS)-powered unmanned aerial vehicles (UAV) is rising, especially due to the microsecond-level reaction time of the bio-inspired event sensor, which increases robustness and reduces latency of the perception tasks compared to an RGB camera.

Parallelizing Optical Flow Estimation on an Ultra-Low Power RISC-V Cluster for Nano-UAV Navigation

no code implementations • 22 May 2023 • Jonas Kühne, Michele Magno, Luca Benini

On micro and nano UAVs, real-time calculation of the optical flow is run on low power and resource-constrained microcontroller units (MCUs).

Autonomous Navigation Optical Flow Estimation

A Fast and Accurate Optical Flow Camera for Resource-Constrained Edge Applications

no code implementations • 22 May 2023 • Jonas Kühne, Michele Magno, Luca Benini

The paper characterizes the optical flow sensor in high frame-rate, low-latency settings, with a frame rate of up to 88 fps at the full resolution of 1124 by 1364 pixels and up to 240 fps at a reduced camera resolution of 280 by 336, for both classical camera images and optical flow data.

Optical Flow Estimation

Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC with 2-to-8b DNN Acceleration and 30%-Boost Adaptive Body Biasing

1 code implementation • 15 May 2023 • Francesco Conti, Gianna Paulin, Angelo Garofalo, Davide Rossi, Alfio Di Mauro, Georg Rutishauser, Gianmarco Ottavi, Manuel Eggimann, Hayate Okuhara, Luca Benini

We present Marsellus, an all-digital heterogeneous SoC for AI-IoT end-nodes fabricated in GlobalFoundries 22nm FDX that combines 1) a general-purpose cluster of 16 RISC-V Digital Signal Processing (DSP) cores attuned for the execution of a diverse range of workloads exploiting 4-bit and 2-bit arithmetic extensions (XpulpNN), combined with fused MAC&LOAD operations and floating-point support; 2) a 2-8bit Reconfigurable Binary Engine (RBE) to accelerate 3x3 and 1x1 (pointwise) convolutions in DNNs; 3) a set of On-Chip Monitoring (OCM) blocks connected to an Adaptive Body Biasing (ABB) generator and a hardware control loop, enabling on-the-fly adaptation of transistor threshold voltages.

SALSA: Simulated Annealing based Loop-Ordering Scheduler for DNN Accelerators

1 code implementation • 20 Apr 2023 • Victor J. B. Jung, Arne Symons, Linyan Mei, Marian Verhelst, Luca Benini

To meet the growing need for computational power for DNNs, multiple specialized hardware architectures have been proposed.

Neuromorphic Optical Flow and Real-time Implementation with Event Cameras

no code implementations • 14 Apr 2023 • Yannick Schnider, Stanislaw Wozniak, Mathias Gehrig, Jules Lecomte, Axel von Arnim, Luca Benini, Davide Scaramuzza, Angeliki Pantazi

Optical flow provides information on relative motion that is an important component in many computer vision pipelines.

Event-based vision Optical Flow Estimation

Factorizers for Distributed Sparse Block Codes

no code implementations • 24 Mar 2023 • Michael Hersche, Aleksandar Terzic, Geethan Karunaratne, Jovin Langenegger, Angéline Pouget, Giovanni Cherubini, Luca Benini, Abu Sebastian, Abbas Rahimi

We provide a methodology to flexibly integrate our factorizer in the classification layer of CNNs with a novel loss function.

Attribute

Hybrid Modular Redundancy: Exploring Modular Redundancy Approaches in RISC-V Multi-Core Computing Clusters for Reliable Processing in Space

no code implementations • 15 Mar 2023 • Michael Rogenmoser, Yvan Tortorella, Davide Rossi, Francesco Conti, Luca Benini

To mitigate the overheads of traditional radiation hardening and modular redundancy approaches, we present a novel Hybrid Modular Redundancy (HMR) approach, a redundancy scheme that features a cluster of RISC-V processors with a flexible on-demand dual-core and triple-core lockstep grouping of computing cores with runtime split-lock capabilities.

Deep Neural Network Architecture Search for Accurate Visual Pose Estimation aboard Nano-UAVs

no code implementations • 3 Mar 2023 • Elia Cereda, Luca Crupi, Matteo Risso, Alessio Burrello, Luca Benini, Alessandro Giusti, Daniele Jahier Pagliari, Daniele Palossi

In this work, we leverage a novel neural architecture search (NAS) technique to automatically identify several Pareto-optimal convolutional neural networks (CNNs) for a visual pose estimation task.

Neural Architecture Search Pose Estimation

Experimenting with Emerging RISC-V Systems for Decentralised Machine Learning

1 code implementation • 15 Feb 2023 • Gianluca Mittone, Nicolò Tonci, Robert Birke, Iacopo Colonnelli, Doriana Medić, Andrea Bartolini, Roberto Esposito, Emanuele Parisi, Francesco Beneventi, Mirko Polato, Massimo Torquati, Luca Benini, Marco Aldinucci

Federated Learning (FL) and Edge Inference are examples of DML.

Federated Learning

Lightweight Neural Architecture Search for Temporal Convolutional Networks at the Edge

1 code implementation • 24 Jan 2023 • Matteo Risso, Alessio Burrello, Francesco Conti, Lorenzo Lamberti, Yukai Chen, Luca Benini, Enrico Macii, Massimo Poncino, Daniele Jahier Pagliari

Neural Architecture Search (NAS) is quickly becoming the go-to approach to optimize the structure of Deep Learning (DL) models for complex tasks such as Image Classification or Object Detection.

Image Classification Neural Architecture Search +4

RedMule: A Mixed-Precision Matrix-Matrix Operation Engine for Flexible and Energy-Efficient On-Chip Linear Algebra and TinyML Training Acceleration

1 code implementation • 10 Jan 2023 • Yvan Tortorella, Luca Bertaccini, Luca Benini, Davide Rossi, Francesco Conti

The increasing interest in TinyML, i.e., near-sensor machine learning on power budgets of a few tens of mW, is currently pushing toward enabling TinyML-class training as opposed to inference only.

Self-sustaining Ultra-wideband Positioning System for Event-driven Indoor Localization

no code implementations • 9 Dec 2022 • Philipp Mayer, Michele Magno, Luca Benini

The energy consumption for position updates, with an accuracy of 40 cm (2D) in realistic non-line-of-sight conditions, is 10.84 mJ.

Indoor Localization Motion Detection +2

Fully On-board Low-Power Localization with Multizone Time-of-Flight Sensors on Nano-UAVs

1 code implementation • 25 Nov 2022 • Hanna Müller, Nicky Zimmerman, Tommaso Polonelli, Michele Magno, Jens Behley, Cyrill Stachniss, Luca Benini

Experimental evaluation using a nano-UAV open platform demonstrated that the proposed solution is capable of localizing on a 31.2 m$^2$ map with 0.15 m accuracy and an above 95% success rate.

In-memory factorization of holographic perceptual representations

1 code implementation • 9 Nov 2022 • Jovin Langenegger, Geethan Karunaratne, Michael Hersche, Luca Benini, Abu Sebastian, Abbas Rahimi

Disentanglement of constituent factors of a sensory signal is central to perception and cognition and hence is a critical task for future artificial intelligence systems.

Disentanglement

An Energy-Efficient Spiking Neural Network for Finger Velocity Decoding for Implantable Brain-Machine Interface

1 code implementation • 7 Oct 2022 • Jiawei Liao, Lars Widmer, Xiaying Wang, Alfio Di Mauro, Samuel R. Nason-Tomaszewski, Cynthia A. Chestek, Luca Benini, Taekwang Jang

Brain-machine interfaces (BMIs) are promising for motor rehabilitation and mobility augmentation.

regression

RUAD: unsupervised anomaly detection in HPC systems

no code implementations • 28 Aug 2022 • Martin Molan, Andrea Borghesi, Daniele Cesarini, Luca Benini, Andrea Bartolini

However, current state-of-the-art (SoA) approaches to anomaly detection are supervised and semi-supervised, so they require a human-labelled dataset with anomalies - this is often impractical to collect in production HPC systems.

Clustering Unsupervised Anomaly Detection

In-memory Realization of In-situ Few-shot Continual Learning with a Dynamically Evolving Explicit Memory

no code implementations • 14 Jul 2022 • Geethan Karunaratne, Michael Hersche, Jovin Langenegger, Giovanni Cherubini, Manuel Le Gallo-Bourdeau, Urs Egger, Kevin Brew, Sam Choi, Injo Ok, Mary Claire Silvestre, Ning Li, Nicole Saulnier, Victor Chan, Ishtiaq Ahsan, Vijay Narayanan, Luca Benini, Abu Sebastian, Abbas Rahimi

We demonstrate for the first time how the EM unit can physically superpose multiple training examples, expand to accommodate unseen classes, and perform similarity search during inference, using operations on an IMC core based on phase-change memory (PCM).

Continual Learning

Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes

1 code implementation • 17 Jun 2022 • Matteo Risso, Alessio Burrello, Luca Benini, Enrico Macii, Massimo Poncino, Daniele Jahier Pagliari

Quantization is widely employed in both cloud and edge systems to reduce the memory occupation, latency, and energy consumption of deep neural networks.

Neural Architecture Search Quantization

Towards On-device Domain Adaptation for Noise-Robust Keyword Spotting

1 code implementation • IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS) 2022 • Cristian Cioflan, Lukas Cavigelli, Manuele Rusci, Miguel de Prado, Luca Benini

The accuracy of a keyword spotting model deployed on embedded devices often degrades when the system is exposed to real environments with significant noise.

Edge-computing Keyword Spotting

Multi-Complexity-Loss DNAS for Energy-Efficient and Memory-Constrained Deep Neural Networks

1 code implementation • 1 Jun 2022 • Matteo Risso, Alessio Burrello, Luca Benini, Enrico Macii, Massimo Poncino, Daniele Jahier Pagliari

When deployed on a commercial edge device, the STM NUCLEO-H743ZI2, our networks span a range of 2.18x in energy consumption and 4.04% in accuracy for the same memory constraint, and reduce energy by up to 2.2x with negligible accuracy drop with respect to the baseline.

Neural Architecture Search

Adaptive Random Forests for Energy-Efficient Inference on Microcontrollers

no code implementations • 27 May 2022 • Francesco Daghero, Alessio Burrello, Chen Xie, Luca Benini, Andrea Calimera, Enrico Macii, Massimo Poncino, Daniele Jahier Pagliari

The accuracy of an RF often increases with the number of internal weak learners (decision trees), but at the cost of a proportional increase in inference latency and energy consumption.

Aerosense: A Self-Sustainable And Long-Range Bluetooth Wireless Sensor Node for Aerodynamic and Aeroacoustic Monitoring on Wind Turbines

no code implementations • 24 May 2022 • Tommaso Polonelli, Hanna Müller, Weikang Kong, Raphael Fischer, Luca Benini, Michele Magno

This paper presents a low-power, self-sustainable, and modular wireless sensor node for aerodynamic and acoustic measurements on wind turbines and other industrial structures.

Data Compression

Reducing Neural Architecture Search Spaces with Training-Free Statistics and Computational Graph Clustering

no code implementations • 29 Apr 2022 • Thorir Mar Ingolfsson, Mark Vero, Xiaying Wang, Lorenzo Lamberti, Luca Benini, Matteo Spallanzani

The computational demands of neural architecture search (NAS) algorithms are usually directly proportional to the size of their target search spaces.

Clustering Graph Clustering +1

Energy-Efficient Tree-Based EEG Artifact Detection

no code implementations • 19 Apr 2022 • Thorir Mar Ingolfsson, Andrea Cossettini, Simone Benatti, Luca Benini

In this work we present the implementation of an artifact detection algorithm based on a minimal number of EEG channels on a parallel ultra-low-power (PULP) embedded platform.

Artifact Detection EEG +1

Energy-Efficient Adaptive Machine Learning on IoT End-Nodes With Class-Dependent Confidence

no code implementations • 7 Apr 2022 • Francesco Daghero, Alessio Burrello, Daniele Jahier Pagliari, Luca Benini, Enrico Macii, Massimo Poncino

Energy-efficient machine learning models that can run directly on edge devices are of great interest in IoT applications, as they can reduce network pressure and response latency, and improve privacy.

BIG-bench Machine Learning

Constrained Few-shot Class-incremental Learning

2 code implementations • CVPR 2022 • Michael Hersche, Geethan Karunaratne, Giovanni Cherubini, Luca Benini, Abu Sebastian, Abbas Rahimi

Moreover, it is imperative that such learning must respect certain memory and computational constraints such as (i) training samples are limited to only a few per class, (ii) the computational cost of learning a novel class remains constant, and (iii) the memory footprint of the model grows at most linearly with the number of classes observed.

Ranked #4 on Few-Shot Class-Incremental Learning on mini-Imagenet

continual few-shot learning Few-Shot Class-Incremental Learning +1

Pruning In Time (PIT): A Lightweight Network Architecture Optimizer for Temporal Convolutional Networks

1 code implementation • 28 Mar 2022 • Matteo Risso, Alessio Burrello, Daniele Jahier Pagliari, Francesco Conti, Lorenzo Lamberti, Enrico Macii, Luca Benini, Massimo Poncino

Temporal Convolutional Networks (TCNs) are promising Deep Learning models for time-series processing tasks.

Time Series Time Series Analysis

Robust and Energy-efficient PPG-based Heart-Rate Monitoring

no code implementations • 28 Mar 2022 • Matteo Risso, Alessio Burrello, Daniele Jahier Pagliari, Simone Benatti, Enrico Macii, Luca Benini, Massimo Poncino

A wrist-worn PPG sensor coupled with a lightweight algorithm can run on an MCU to enable non-invasive and comfortable monitoring, but ensuring robust PPG-based heart-rate monitoring in the presence of motion artifacts is still an open challenge.

Neural Architecture Search

MI-BMInet: An Efficient Convolutional Neural Network for Motor Imagery Brain-Machine Interfaces with EEG Channel Selection

no code implementations • 28 Mar 2022 • Xiaying Wang, Michael Hersche, Michele Magno, Luca Benini

A brain-machine interface (BMI) based on motor imagery (MI) enables the control of devices using brain signals while the subject imagines performing a movement.

EEG Motor Imagery

TCN Mapping Optimization for Ultra-Low Power Time-Series Edge Inference

no code implementations • 24 Mar 2022 • Alessio Burrello, Alberto Dequino, Daniele Jahier Pagliari, Francesco Conti, Marcello Zanghieri, Enrico Macii, Luca Benini, Massimo Poncino

Temporal Convolutional Networks (TCNs) are emerging lightweight Deep Learning models for Time Series analysis.

Time Series Time Series Analysis

Bioformers: Embedding Transformers for Ultra-Low Power sEMG-based Gesture Recognition

no code implementations • 24 Mar 2022 • Alessio Burrello, Francesco Bianco Morghet, Moritz Scherer, Simone Benatti, Luca Benini, Enrico Macii, Massimo Poncino, Daniele Jahier Pagliari

Human-machine interaction is gaining traction in rehabilitation tasks, such as controlling prosthetic hands or robotic arms.

Gesture Recognition

Q-PPG: Energy-Efficient PPG-based Heart Rate Monitoring on Wearable Devices

1 code implementation • 24 Mar 2022 • Alessio Burrello, Daniele Jahier Pagliari, Matteo Risso, Simone Benatti, Enrico Macii, Luca Benini, Massimo Poncino

Our most accurate quantized network achieves 4.41 Beats Per Minute (BPM) of Mean Absolute Error (MAE), with an energy consumption of 47.65 mJ and a memory footprint of 412 kB.

Neural Architecture Search Photoplethysmography (PPG)

Training Quantised Neural Networks with STE Variants: the Additive Noise Annealing Algorithm

no code implementations • CVPR 2022 • Matteo Spallanzani, Gian Paolo Leonardi, Luca Benini

When testing ANA on the CIFAR-10 image classification benchmark, we find that the major impact on task accuracy is not due to the qualitative shape of the regularisations but to the proper synchronisation of the different STE variants used in a network, in accordance with the theoretical results.

Image Classification

A Neuro-vector-symbolic Architecture for Solving Raven's Progressive Matrices

1 code implementation • 9 Mar 2022 • Michael Hersche, Mustafa Zeqiri, Luca Benini, Abu Sebastian, Abbas Rahimi

Compared to state-of-the-art deep neural network and neuro-symbolic approaches, end-to-end training of NVSA achieves a new record of 87.7% average accuracy in RAVEN, and 88.1% in I-RAVEN datasets.

Logical Reasoning

Exploring Scalable, Distributed Real-Time Anomaly Detection for Bridge Health Monitoring

1 code implementation • 4 Mar 2022 • Amirhossein Moallemi, Alessio Burrello, Davide Brunelli, Luca Benini

Modern real-time Structural Health Monitoring systems can generate a considerable amount of information that must be processed and evaluated for detecting early anomalies and generating prompt warnings and alarms about the civil infrastructure conditions.

Anomaly Detection Cloud Computing

Paper
Code

Embedding Temporal Convolutional Networks for Energy-Efficient PPG-Based Heart Rate Monitoring

no code implementations • 1 Mar 2022 • Alessio Burrello, Daniele Jahier Pagliari, Pierangelo Maria Rapa, Matilde Semilia, Matteo Risso, Tommaso Polonelli, Massimo Poncino, Luca Benini, Simone Benatti

Photoplethysmography (PPG) sensors allow for non-invasive and comfortable heart-rate (HR) monitoring, suitable for compact wrist-worn devices.

Neural Architecture Search Photoplethysmography (PPG)

Paper
Add Code

Vau da muntanialas: Energy-efficient multi-die scalable acceleration of RNN inference

no code implementations • 14 Feb 2022 • Gianna Paulin, Francesco Conti, Lukas Cavigelli, Luca Benini

To quantify the overall system power, including I/O power, we built Vau da Muntanialas, to the best of our knowledge the first demonstration of a systolic multi-chip-on-PCB array of RNN accelerators.

Quantization speech-recognition +2

Paper
Add Code

GVSoC: A Highly Configurable, Fast and Accurate Full-Platform Simulator for RISC-V based IoT Processors

1 code implementation • 20 Jan 2022 • Nazareno Bruschi, Germain Haugou, Giuseppe Tagliavini, Francesco Conti, Luca Benini, Davide Rossi

The last few years have seen the emergence of IoT processors: ultra-low power systems-on-chips (SoCs) combining lightweight and flexible micro-controller units (MCUs), often based on open-ISA RISC-V cores, with application-specific accelerators to maximize performance and energy efficiency.

Paper
Code

A Heterogeneous In-Memory Computing Cluster For Flexible End-to-End Inference of Real-World Deep Neural Networks

no code implementations • 4 Jan 2022 • Angelo Garofalo, Gianmarco Ottavi, Francesco Conti, Geethan Karunaratne, Irem Boybat, Luca Benini, Davide Rossi

Furthermore, we explore the requirements for end-to-end inference of a full mobile-grade DNN (MobileNetV2) in terms of IMC array resources, by scaling up our heterogeneous architecture to a multi-array accelerator.

Paper
Add Code

Sub-100uW Multispectral Riemannian Classification for EEG-based Brain--Machine Interfaces

no code implementations • 18 Dec 2021 • Xiaying Wang, Lukas Cavigelli, Tibor Schneider, Luca Benini

Motor imagery brain--machine interfaces enable us to control machines by merely thinking of performing a motor action.

Classification EEG +1

Paper
Add Code

A TinyML Platform for On-Device Continual Learning with Quantized Latent Replays

no code implementations • 20 Oct 2021 • Leonardo Ravaglia, Manuele Rusci, Davide Nadalini, Alessandro Capotondi, Francesco Conti, Luca Benini

In this work, we introduce a HW/SW platform for end-to-end CL based on a 10-core FP32-enabled parallel ultra-low-power (PULP) processor.

Continual Learning Quantization

Paper
Add Code

Vega: A 10-Core SoC for IoT End-Nodes with DNN Acceleration and Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode

no code implementations • 18 Oct 2021 • Davide Rossi, Francesco Conti, Manuel Eggimann, Alfio Di Mauro, Giuseppe Tagliavini, Stefan Mach, Marco Guermandi, Antonio Pullini, Igor Loi, Jie Chen, Eric Flamand, Luca Benini

Vega achieves SoA-leading efficiency of 615 GOPS/W on 8-bit INT computation (boosted to 1.3 TOPS/W for 8-bit DNN inference with hardware acceleration).

Management

Paper
Add Code

Practical Adversarial Attacks on Brain--Computer Interfaces

no code implementations • 29 Sep 2021 • Rodolfo Octavio Siller Quintanilla, Xiaying Wang, Michael Hersche, Luca Benini, Gagandeep Singh

We propose new methods to induce denial-of-service attacks and incorporate domain-specific insights and constraints to accomplish two key goals: (i) create smooth adversarial attacks that are physiologically plausible; (ii) consider the realistic case where the attack happens at the origin of the signal acquisition and it propagates on the human head.

EEG

Paper
Add Code

A Fully-Integrated 5mW, 0.8Gbps Energy-Efficient Chip-to-Chip Data Link for Ultra-Low-Power IoT End-Nodes in 65-nm CMOS

no code implementations • 5 Sep 2021 • Hayate Okuhara, Ahmed Elnaqib, Martino Dazzi, Pierpaolo Palestri, Simone Benatti, Luca Benini, Davide Rossi

The increasing complexity of Internet-of-Things (IoT) applications and near-sensor processing algorithms is pushing the computational power of low-power, battery-operated end-node systems.

Paper
Add Code

Memory-Aware Partitioning of Machine Learning Applications for Optimal Energy Use in Batteryless Systems

no code implementations • 5 Aug 2021 • Andres Gomez, Andreas Tretter, Pascal Alexander Hager, Praveenth Sanmugarajah, Luca Benini, Lothar Thiele

By leveraging interkernel data dependencies, these energy-bounded execution cycles minimize the number of system activations and nonvolatile data transfers, and thus the total energy overhead.

Total Energy

Paper
Add Code

A Construction Kit for Efficient Low Power Neural Network Accelerator Designs

no code implementations • 24 Jun 2021 • Petar Jokic, Erfan Azarkhish, Andrea Bonetti, Marc Pons, Stephane Emery, Luca Benini

This work provides a survey of neural network accelerator optimization approaches that have been used in recent works and reports their individual effects on edge processing performance.

Paper
Add Code

NN2CAM: Automated Neural Network Mapping for Multi-Precision Edge Processing on FPGA-Based Cameras

no code implementations • 24 Jun 2021 • Petar Jokic, Stephane Emery, Luca Benini

The record-breaking achievements of deep neural networks (DNNs) in image classification and detection tasks resulted in a surge of new computer vision applications during the past years.

Image Classification

Paper
Add Code

Towards Long-term Non-invasive Monitoring for Epilepsy via Wearable EEG Devices

no code implementations • 15 Jun 2021 • Thorir Mar Ingolfsson, Andrea Cossettini, Xiaying Wang, Enrico Tabanelli, Giuseppe Tagliavini, Philippe Ryvlin, Luca Benini, Simone Benatti

We present the implementation of seizure detection algorithms based on a minimal number of EEG channels on a parallel ultra-low-power embedded platform.

EEG Seizure Detection

Paper
Add Code

Trimming Feature Extraction and Inference for MCU-based Edge NILM: a Systematic Approach

no code implementations • 21 May 2021 • Enrico Tabanelli, Davide Brunelli, Andrea Acquaviva, Luca Benini

State-of-the-Art approaches are based on Machine Learning methods and exploit the fusion of time- and frequency-domain features from current and voltage sensors.

Non-Intrusive Load Monitoring

Paper
Add Code

Implementing CNN Layers on the Manticore Cluster-Based Many-Core Architecture

no code implementations • 16 Apr 2021 • Andreas Kurth, Fabian Schuiki, Luca Benini

This document presents implementations of fundamental convolutional neural network (CNN) layers on the Manticore cluster-based many-core architecture and discusses their characteristics and trade-offs.

Paper
Add Code

Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Co-design

no code implementations • 25 Mar 2021 • Cong Hao, Jordan Dotzel, JinJun Xiong, Luca Benini, Zhiru Zhang, Deming Chen

Artificial intelligence (AI) technologies have dramatically advanced in recent years, resulting in revolutionary changes in people's lives.

Benchmarking Edge-computing

Paper
Add Code

ECG-TCN: Wearable Cardiac Arrhythmia Detection with a Temporal Convolutional Network

1 code implementation • 25 Mar 2021 • Thorir Mar Ingolfsson, Xiaying Wang, Michael Hersche, Alessio Burrello, Lukas Cavigelli, Luca Benini

With 9.91 GMAC/s/W, it is 23.0 times more energy-efficient and 46.85 times faster than an implementation on the ARM Cortex M4F (0.43 GMAC/s/W).

Arrhythmia Detection

Paper
Code

Mixed-Precision Quantization and Parallel Implementation of Multispectral Riemannian Classification for Brain--Machine Interfaces

1 code implementation • 22 Feb 2021 • Xiaying Wang, Tibor Schneider, Michael Hersche, Lukas Cavigelli, Luca Benini

With Motor-Imagery (MI) Brain--Machine Interfaces (BMIs) we may control machines by merely thinking of performing a motor action.

General Classification Motor Imagery +1

Paper
Code

A 5 μW Standard Cell Memory-based Configurable Hyperdimensional Computing Accelerator for Always-on Smart Sensing

no code implementations • 4 Feb 2021 • Manuel Eggimann, Abbas Rahimi, Luca Benini

Hyperdimensional computing (HDC) is a brain-inspired computing paradigm based on high-dimensional holistic representations of vectors.

EMG Gesture Recognition Fault Detection +1

Paper
Add Code

Sound Event Detection with Binary Neural Networks on Tightly Power-Constrained IoT Devices

no code implementations • 12 Jan 2021 • Gianmarco Cerutti, Renzo Andri, Lukas Cavigelli, Michele Magno, Elisabetta Farella, Luca Benini

This BNN reaches a 77.9% accuracy, just 7% lower than the full-precision version, with 58 kB (7.2 times less) for the weights and 262 kB (2.4 times less) memory in total.

Event Detection Object Recognition +2

Paper
Add Code

CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration with Better-than-Binary Energy Efficiency

no code implementations • 3 Nov 2020 • Moritz Scherer, Georg Rutishauser, Lukas Cavigelli, Luca Benini

We present a 3.1 POp/s/W fully digital hardware accelerator for ternary neural networks.

Hardware Architecture

Paper
Add Code

Binarization Methods for Motor-Imagery Brain-Computer Interface Classification

no code implementations • 14 Oct 2020 • Michael Hersche, Luca Benini, Abbas Rahimi

Our first method, based on sparse bipolar random projection, projects a large number of real-valued Riemannian covariance features to a binary space, where a linear SVM classifier can be learned with binary weights too.

Binarization Classification +2

Paper
Add Code

Robust High-dimensional Memory-augmented Neural Networks

no code implementations • 5 Oct 2020 • Geethan Karunaratne, Manuel Schmuck, Manuel Le Gallo, Giovanni Cherubini, Luca Benini, Abu Sebastian, Abbas Rahimi

Traditional neural networks require enormous amounts of data to build their complex mappings during a slow training procedure that hinders their abilities for relearning and adapting to new data.

Few-Shot Image Classification Vocal Bursts Intensity Prediction

Paper
Add Code

Leveraging Automated Mixed-Low-Precision Quantization for tiny edge microcontrollers

no code implementations • 12 Aug 2020 • Manuele Rusci, Marco Fariselli, Alessandro Capotondi, Luca Benini

The severe on-chip memory limitations are currently preventing the deployment of the most accurate Deep Neural Network (DNN) models on tiny MicroController Units (MCUs), even if leveraging an effective 8-bit quantization scheme.

Quantization

Paper
Add Code

Improving Memory Utilization in Convolutional Neural Network Accelerators

no code implementations • 20 Jul 2020 • Petar Jokic, Stephane Emery, Luca Benini

While the accuracy of convolutional neural networks has achieved vast improvements by introducing larger and deeper network architectures, also the memory footprint for storing their parameters and activations has increased.

Paper
Add Code

Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node

no code implementations • 17 Jul 2020 • Alfio Di Mauro, Francesco Conti, Pasquale Davide Schiavone, Davide Rossi, Luca Benini

On a prototype in 22nm FDX technology, we demonstrate that both the logic and SRAM voltage can be dropped to 0.5V without any accuracy penalty on a BNN trained for the CIFAR-10 dataset, improving energy efficiency by 2.2X w.r.t.

PICO

Paper
Add Code

A 0.5GHz 0.35mW LDO-Powered Constant-Slope Phase Interpolator with 0.22% INL

no code implementations • 15 Jul 2020 • Ahmed Elnaqib, Hayate Okuhara, Taekwang Jang, Davide Rossi, Luca Benini

Clock generators are an essential and critical building block of any communication link, whether it be wired or wireless, and they are increasingly critical given the push for lower I/O power and higher bandwidth in Systems-on-Chip (SoCs) for the Internet-of-Things (IoT).

Paper
Add Code

TinyRadarNN: Combining Spatial and Temporal Convolutional Neural Networks for Embedded Gesture Recognition with Short Range Radars

1 code implementation • 25 Jun 2020 • Moritz Scherer, Michele Magno, Jonas Erb, Philipp Mayer, Manuel Eggimann, Luca Benini

Furthermore, the gesture recognition classifier has been implemented on a Parallel Ultra-Low Power Processor, demonstrating that real-time prediction is feasible with only 21 mW of power consumption for the full TCN sequence prediction network, while a system-level power consumption of less than 100 mW is achieved.

Hand Gesture Recognition Hand-Gesture Recognition

Paper
Code

Automated Design Space Exploration for optimised Deployment of DNN on Arm Cortex-A CPUs

no code implementations • 9 Jun 2020 • Miguel de Prado, Andrew Mundy, Rabia Saeed, Maurizio Denna, Nuria Pazos, Luca Benini

The framework relies on a Reinforcement Learning search that, combined with a deep learning inference framework, automatically explores the design space and learns an optimised solution that speeds up the performance and reduces the memory on embedded CPU platforms.

Paper
Add Code

EEG-TCNet: An Accurate Temporal Convolutional Network for Embedded Motor-Imagery Brain-Machine Interfaces

1 code implementation • 31 May 2020 • Thorir Mar Ingolfsson, Michael Hersche, Xiaying Wang, Nobuaki Kobayashi, Lukas Cavigelli, Luca Benini

Experimental results on the BCI Competition IV-2a dataset show that EEG-TCNet achieves 77.35% classification accuracy in 4-class MI.

EEG General Classification +2

Paper
Code

ChewBaccaNN: A Flexible 223 TOPS/W BNN Accelerator

no code implementations • 12 May 2020 • Renzo Andri, Geethan Karunaratne, Lukas Cavigelli, Luca Benini

Furthermore, it can perform inference on a binarized ResNet-18 trained with 8-bases Group-Net to achieve a 67.5% Top-1 accuracy with only 3.0 mJ/frame -- at an accuracy drop of merely 1.8% from the full-precision ResNet-18.

Paper
Add Code

Optimizing Temporal Convolutional Network inference on FPGA-based accelerators

no code implementations • 7 May 2020 • Marco Carreras, Gianfranco Deriu, Luigi Raffo, Luca Benini, Paolo Meloni

Convolutional Neural Networks are extensively used in a wide range of applications, commonly including computer vision tasks like image and video classification, recognition, and segmentation.

Scheduling Time Series Analysis +1

Paper
Add Code

Q-EEGNet: an Energy-Efficient 8-bit Quantized Parallel EEGNet Implementation for Edge Motor-Imagery Brain--Machine Interfaces

1 code implementation • 24 Apr 2020 • Tibor Schneider, Xiaying Wang, Michael Hersche, Lukas Cavigelli, Luca Benini

We quantize weights and activations to 8-bit fixed-point with a negligible accuracy loss of 0.4% on 4-class MI, and present an energy-efficient hardware-aware implementation on the Mr. Wolf parallel ultra-low power (PULP) System-on-Chip (SoC) by utilizing its custom RISC-V ISA extensions and 8-core compute cluster.

EEG Motor Imagery

Paper
Code

LLHD: A Multi-level Intermediate Representation for Hardware Description Languages

1 code implementation • 7 Apr 2020 • Fabian Schuiki, Andreas Kurth, Tobias Grosser, Luca Benini

These tools are monolithic and mostly proprietary, disagree in their implementation of HDLs, and while many redundant IRs exist, no IR today can be used through the entire circuit design flow.

Programming Languages

Paper
Code

pAElla: Edge-AI based Real-Time Malware Detection in Data Centers

1 code implementation • 7 Apr 2020 • Antonio Libri, Andrea Bartolini, Luca Benini

The method -- called pAElla -- targets real-time Malware Detection (MD), it runs on an out-of-band IoT-based monitoring system for DCs/SCs, and involves Power Spectral Density of power measurements, along with AutoEncoders.

Anomaly Detection Edge-computing +1

Paper
Code

An Accurate EEGNet-based Motor-Imagery Brain-Computer Interface for Low-Power Edge Computing

no code implementations • 31 Mar 2020 • Xiaying Wang, Michael Hersche, Batuhan Tömekce, Burak Kaya, Michele Magno, Luca Benini

Our novel method further scales down the standard EEGNet at a negligible accuracy loss of 0.31% with 7.6x memory footprint reduction and a small accuracy loss of 2.51% with 15x reduction.

Edge-computing EEG +2

Paper
Add Code

InfiniWolf: Energy Efficient Smart Bracelet for Edge Computing with Dual Source Energy Harvesting

no code implementations • 28 Feb 2020 • Michele Magno, Xiaying Wang, Manuel Eggimann, Lukas Cavigelli, Luca Benini

This work presents InfiniWolf, a novel multi-sensor smartwatch that can achieve self-sustainability exploiting thermal and solar energy harvesting, performing computationally high demanding tasks.

Edge-computing

Paper
Add Code

Extending the RISC-V ISA for Efficient RNN-based 5G Radio Resource Management

no code implementations • 27 Feb 2020 • Renzo Andri, Tomas Henriksson, Luca Benini

Radio Resource Management (RRM) in 5G mobile communication is a challenging problem for which Recurrent Neural Networks (RNN) have shown promising results.

Management

Paper
Add Code

Combining Learning and Optimization for Transprecision Computing

2 code implementations • 24 Feb 2020 • Andrea Borghesi, Giuseppe Tagliavini, Michele Lombardi, Luca Benini, Michela Milano

The ML model learns the relation between variables precision and the output error; this information is then embedded in the MP focused on minimizing the number of bits.

Distributed, Parallel, and Cluster Computing

Paper
Code

RPR: Random Partition Relaxation for Training Binary and Ternary Weight Neural Networks

no code implementations • 4 Jan 2020 • Lukas Cavigelli, Luca Benini

We present Random Partition Relaxation (RPR), a method for strong quantization of neural network weights to binary (+1/-1) and ternary (+1/0/-1) values.

Quantization

Paper
Add Code

HR-SAR-Net: A Deep Neural Network for Urban Scene Segmentation from High-Resolution SAR Data

no code implementations • 10 Dec 2019 • Xiaying Wang, Lukas Cavigelli, Manuel Eggimann, Michele Magno, Luca Benini

Synthetic aperture radar (SAR) data is becoming increasingly available to a wide range of users through commercial service providers with resolutions reaching 0.5 m/px.

Scene Segmentation Segmentation

Paper
Add Code

Constrained deep neural network architecture search for IoT devices accounting for hardware calibration

no code implementations • NeurIPS 2019 • Florian Scheidegger, Luca Benini, Costas Bekas, A. Cristiano I. Malossi

The narrow-space search of floating-point models improves the accuracy on CIFAR10 of an established IoT model from 70.64% to 74.87% within the same memory constraints.

General Classification Image Classification

Paper
Add Code

FANN-on-MCU: An Open-Source Toolkit for Energy-Efficient Neural Network Inference at the Edge of the Internet of Things

1 code implementation • 8 Nov 2019 • Xiaying Wang, Michele Magno, Lukas Cavigelli, Luca Benini

The growing number of low-power smart devices in the Internet of Things is coupled with the concept of "Edge Computing", that is moving some of the intelligence, especially machine learning, towards the edge of the network.

BIG-bench Machine Learning Edge-computing +1

Paper
Code

Constrained deep neural network architecture search for IoT devices accounting hardware calibration

no code implementations • 24 Sep 2019 • Florian Scheidegger, Luca Benini, Costas Bekas, Cristiano Malossi

We further improve the accuracy to 82.07% by including 16-bit half types and we obtain the best accuracy of 83.45% by extending the search with model optimized IEEE 754 reduced types.

General Classification Image Classification

Paper
Add Code

EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerators

2 code implementations • 30 Aug 2019 • Lukas Cavigelli, Georg Rutishauser, Luca Benini

In the wake of the success of convolutional neural networks in image classification, object recognition, speech recognition, etc., the demand for deploying these compute-intensive ML models on embedded and mobile systems with tight power and energy constraints at low cost, as well as for boosting throughput in data centers, is growing rapidly.

Image Classification Object Recognition +2

Paper
Code

PULP-NN: Accelerating Quantized Neural Networks on Parallel Ultra-Low-Power RISC-V Processors

1 code implementation • 29 Aug 2019 • Angelo Garofalo, Manuele Rusci, Francesco Conti, Davide Rossi, Luca Benini

We present PULP-NN, an optimized computing library for a parallel ultra-low-power tightly coupled cluster of RISC-V processors.

Quantization

Paper
Code

5 Parallel Prism: A topology for pipelined implementations of convolutional neural networks using computational memory

no code implementations • 8 Jun 2019 • Martino Dazzi, Abu Sebastian, Pier Andrea Francese, Thomas Parnell, Luca Benini, Evangelos Eleftheriou

We show that this communication fabric facilitates the pipelined execution of all state-of-the-art CNNs by proving the existence of a homomorphism between one graph representation of these networks and the proposed graph topology.

Paper
Add Code

In-memory hyperdimensional computing

no code implementations • 4 Jun 2019 • Geethan Karunaratne, Manuel Le Gallo, Giovanni Cherubini, Luca Benini, Abbas Rahimi, Abu Sebastian

Hyperdimensional computing (HDC) is an emerging computational framework that takes inspiration from attributes of neuronal circuits such as hyperdimensionality, fully distributed holographic representation, and (pseudo)randomness.

Attribute Classification +4

Paper
Add Code

Ara: A 1 GHz+ Scalable and Energy-Efficient RISC-V Vector Processor with Multi-Precision Floating Point Support in 22 nm FD-SOI

no code implementations • 2 Jun 2019 • Matheus Cavalcante, Fabian Schuiki, Florian Zaruba, Michael Schaffner, Luca Benini

In this paper, we present Ara, a 64-bit vector processor based on the version 0.5 draft of RISC-V's vector extension, implemented in GlobalFoundries 22FDX FD-SOI technology.

Hardware Architecture

Paper
Add Code

Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers

2 code implementations • 30 May 2019 • Manuele Rusci, Alessandro Capotondi, Luca Benini

To fit the memory and computational limitations of resource-constrained edge-devices, we exploit mixed low-bitwidth compression, featuring 8, 4 or 2-bit uniform quantization, and we model the inference graph with integer-only operations.

Quantization

Paper
Code

Additive Noise Annealing and Approximation Properties of Quantized Neural Networks

1 code implementation • 24 May 2019 • Matteo Spallanzani, Lukas Cavigelli, Gian Paolo Leonardi, Marko Bertogna, Luca Benini

We present a theoretical and experimental investigation of the quantization problem for artificial neural networks.

Image Classification Quantization

Paper
Code

An Open Source and Open Hardware Deep Learning-powered Visual Navigation Engine for Autonomous Nano-UAVs

2 code implementations • 10 May 2019 • Daniele Palossi, Francesco Conti, Luca Benini

Nano-size unmanned aerial vehicles (UAVs), with few centimeters of diameter and sub-10 Watts of total power budget, have so far been considered incapable of running sophisticated visual-based autonomous navigation software without external aid from base-stations, ad-hoc local positioning infrastructure, and powerful external computation servers.

Autonomous Navigation Visual Navigation

Paper
Code

Online Anomaly Detection in HPC Systems

1 code implementation • 22 Feb 2019 • Andrea Borghesi, Antonio Libri, Luca Benini, Andrea Bartolini

Reliability is a cumbersome problem in High Performance Computing Systems and Data Centers evolution.

Distributed, Parallel, and Cluster Computing

Paper
Code

Optimally Scheduling CNN Convolutions for Efficient Memory Access

no code implementations • 4 Feb 2019 • Arthur Stoutchinin, Francesco Conti, Luca Benini

Embedded inference engines for convolutional networks must be parsimonious in memory bandwidth and buffer sizing to meet power and cost constraints.

Scheduling

Paper
Add Code

Bonseyes AI Pipeline -- bringing AI to you. End-to-end integration of data, algorithms and deployment tools

no code implementations • 15 Jan 2019 • Miguel de Prado, Jing Su, Rabia Saeed, Lorenzo Keller, Noelia Vallez, Andrew Anderson, David Gregg, Luca Benini, Tim Llewellynn, Nabil Ouerhani, Rozenn Dahyot, Nuria Pazos

In this work, we present a modular AI pipeline as an integrating framework to bring data, algorithms, and deployment tools together.

Automatic Speech Recognition (ASR) Image Classification +3

Paper
Add Code

Analysis of Contraction Effort Level in EMG-Based Gesture Recognition Using Hyperdimensional Computing

no code implementations • 2 Jan 2019 • Ali Moin, Andy Zhou, Simone Benatti, Abbas Rahimi, Luca Benini, Jan M. Rabaey

Varying contraction levels of muscles is a big challenge in electromyography-based gesture recognition.

General Classification Hand Gesture Recognition +1

Paper
Add Code

Exploring Embedding Methods in Binary Hyperdimensional Computing: A Case Study for Motor-Imagery based Brain-Computer Interfaces

1 code implementation • 13 Dec 2018 • Michael Hersche, José del R. Millán, Luca Benini, Abbas Rahimi

All these methods, differing in complexity, aim to represent EEG signals in binary HD space, e.g., with 10,000 bits.

EEG Motor Imagery +1

Paper
Code

Learning to infer: RL-based search for DNN primitive selection on Heterogeneous Embedded Systems

no code implementations • 18 Nov 2018 • Miguel de Prado, Nuria Pazos, Luca Benini

In this work, we present QS-DNN, a fully automatic search based on Reinforcement Learning which, combined with an inference engine optimizer, efficiently explores through the design space and empirically finds the optimal combinations of libraries and primitives to speed up the inference of CNNs on heterogeneous embedded devices.

Paper
Add Code

QUENN: QUantization Engine for low-power Neural Networks

no code implementations • 14 Nov 2018 • Miguel de Prado, Maurizio Denna, Luca Benini, Nuria Pazos

Deep Learning is moving to edge devices, ushering in a new age of distributed Artificial Intelligence (AI).

Clustering Quantization

Paper
Add Code

Anomaly Detection using Autoencoders in High Performance Computing Systems

5 code implementations • 13 Nov 2018 • Andrea Borghesi, Andrea Bartolini, Michele Lombardi, Michela Milano, Luca Benini

Anomaly detection in supercomputers is a very difficult problem due to the big scale of the systems and the high number of components.

Anomaly Detection Vocal Bursts Intensity Prediction

Paper
Code

Robust identification of thermal models for in-production High-Performance-Computing clusters with machine learning-based data selection

no code implementations • 3 Oct 2018 • Federico Pittino, Roberto Diversi, Luca Benini, Andrea Bartolini

However, we also show that: 1) not all real workloads allow for the identification of a good model; 2) starting from the theory of system identification it is very difficult to evaluate if a trace of data leads to a good estimated model.

Management Quantization

Paper
Add Code

Extended Bit-Plane Compression for Convolutional Neural Network Accelerators

1 code implementation • 1 Oct 2018 • Lukas Cavigelli, Luca Benini

After the tremendous success of convolutional neural networks in image classification, object detection, speech recognition, etc., there is now rising demand for deployment of these compute-intensive ML models on tightly power constrained embedded and mobile systems at low cost as well as for pushing the throughput in data centers.

Image Classification object-detection +3

Paper
Code

One-shot Learning for iEEG Seizure Detection Using End-to-end Binary Operations: Local Binary Patterns with Hyperdimensional Computing

no code implementations • 6 Sep 2018 • Alessio Burrello, Kaspar Schindler, Luca Benini, Abbas Rahimi

This paper presents an efficient binarized algorithm for both learning and classification of human epileptic seizures from intracranial electroencephalography (iEEG).

One-Shot Learning Seizure Detection +3

Paper
Add Code

CBinfer: Exploiting Frame-to-Frame Locality for Faster Convolutional Network Inference on Video Streams

2 code implementations • 15 Aug 2018 • Lukas Cavigelli, Luca Benini

The last few years have brought advances in computer vision at an amazing pace, grounded on new findings in deep neural network construction and training as well as the availability of large labeled datasets.

object-detection Object Detection +1

Paper
Code

Hardware Optimizations of Dense Binary Hyperdimensional Computing: Rematerialization of Hypervectors, Binarized Bundling, and Combinational Associative Memory

1 code implementation • 20 Jul 2018 • Manuel Schmuck, Luca Benini, Abbas Rahimi

In this paper, we propose hardware techniques for optimizations of HD computing, in a synthesizable VHDL library, to enable co-located implementation of both learning and classification tasks on only a small portion of Xilinx(R) UltraScale(TM) FPGAs: (1) We propose simple logical operations to rematerialize the hypervectors on the fly rather than loading them from memory.

Paper
Code

XNOR Neural Engine: a Hardware Accelerator IP for 21.6 fJ/op Binary Neural Network Inference

1 code implementation • 9 Jul 2018 • Francesco Conti, Pasquale Davide Schiavone, Luca Benini

Binary Neural Networks (BNNs) are promising to deliver accuracy comparable to conventional deep neural networks at a fraction of the cost in terms of memory and energy.

Paper
Code

COUNTDOWN - three, two, one, low power! A Run-time Library for Energy Saving in MPI Communication Primitives

1 code implementation • 19 Jun 2018 • Daniele Cesarini, Andrea Bartolini, Pietro Bonfà, Carlo Cavazzoni, Luca Benini

Power consumption is a looming threat in today's computing progress.

Distributed, Parallel, and Cluster Computing

Paper
Code

Fast and Accurate Multiclass Inference for MI-BCIs Using Large Multiscale Temporal and Spectral Features

2 code implementations • 18 Jun 2018 • Michael Hersche, Tino Rellstab, Pasquale Davide Schiavone, Lukas Cavigelli, Luca Benini, Abbas Rahimi

Accurate, fast, and reliable multiclass classification of electroencephalography (EEG) signals is a challenging task towards the development of motor imagery brain-computer interface (MI-BCI) systems.

Classification EEG +1

Paper
Code

A 64mW DNN-based Visual Navigation Engine for Autonomous Nano-Drones

3 code implementations • 4 May 2018 • Daniele Palossi, Antonio Loquercio, Francesco Conti, Eric Flamand, Davide Scaramuzza, Luca Benini

As part of our general methodology we discuss the software mapping techniques that enable the state-of-the-art deep convolutional neural network presented in [1] to be fully executed on-board within a strict 6 fps real-time constraint with no compromise in terms of flight results, while all processing is done with only 64 mW on average.

Autonomous Navigation Visual Navigation

Paper
Code

Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy

1 code implementation • 26 Mar 2018 • Florian Scheidegger, Roxana Istrate, Giovanni Mariani, Luca Benini, Costas Bekas, Cristiano Malossi

In the deep-learning community new algorithms are published at an incredible pace.

General Classification Image Classification

Paper
Code

XNORBIN: A 95 TOp/s/W Hardware Accelerator for Binary Convolutional Neural Networks

no code implementations • 5 Mar 2018 • Andrawes Al Bahou, Geethan Karunaratne, Renzo Andri, Lukas Cavigelli, Luca Benini

Deploying state-of-the-art CNNs requires power-hungry processors and off-chip memory.

Quantization

Paper

Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine

no code implementations • 5 Mar 2018 • Renzo Andri, Lukas Cavigelli, Davide Rossi, Luca Benini

Deep neural networks have achieved impressive results in computer vision and machine learning.

Quantization

Paper

An EMG Gesture Recognition System with Flexible High-Density Sensors and Brain-Inspired High-Dimensional Classifier

1 code implementation • 28 Feb 2018 • Ali Moin, Andy Zhou, Abbas Rahimi, Simone Benatti, Alisha Menon, Senam Tamakloe, Jonathan Ting, Natasha Yamamoto, Yasser Khan, Fred Burghardt, Luca Benini, Ana C. Arias, Jan M. Rabaey

We present an end-to-end system combating this variability using a large-area, high-density sensor array and a robust classification algorithm.

EMG Gesture Recognition General Classification +4

Paper
Code

A Scalable Near-Memory Architecture for Training Deep Neural Networks on Large In-Memory Datasets

no code implementations • 19 Feb 2018 • Fabian Schuiki, Michael Schaffner, Frank K. Gürkaynak, Luca Benini

Most investigations into near-memory hardware accelerators for deep neural networks have primarily focused on inference, while the potential of accelerating training has received relatively little attention so far.

Distributed, Parallel, and Cluster Computing Hardware Architecture

Paper

HERO: Heterogeneous Embedded Research Platform for Exploring RISC-V Manycore Accelerators on FPGA

2 code implementations • 18 Dec 2017 • Andreas Kurth, Pirmin Vogel, Alessandro Capotondi, Andrea Marongiu, Luca Benini

Heterogeneous embedded systems on chip (HESoCs) co-integrate a standard host processor with programmable manycore accelerators (PMCAs) to combine general-purpose computing with domain-specific, efficient processing capabilities.

Hardware Architecture Distributed, Parallel, and Cluster Computing

Paper
Code

NEURAghe: Exploiting CPU-FPGA Synergies for Efficient and Flexible CNN Inference Acceleration on Zynq SoCs

no code implementations • 4 Dec 2017 • Paolo Meloni, Alessandro Capotondi, Gianfranco Deriu, Michele Brian, Francesco Conti, Davide Rossi, Luigi Raffo, Luca Benini

Deep convolutional neural networks (CNNs) obtain outstanding results in tasks that require human-level understanding of data, like image or speech recognition.

Speech Recognition

Paper

Design Automation for Binarized Neural Networks: A Quantum Leap Opportunity?

no code implementations • 21 Nov 2017 • Manuele Rusci, Lukas Cavigelli, Luca Benini

Design automation in general, and in particular logic synthesis, can play a key role in enabling the design of application-specific Binarized Neural Networks (BNN).

Paper

Chipmunk: A Systolically Scalable 0.9 mm$^2$, 3.08 Gop/s/mW @ 1.2 mW Accelerator for Near-Sensor Recurrent Neural Network Inference

no code implementations • 15 Nov 2017 • Francesco Conti, Lukas Cavigelli, Gianna Paulin, Igor Susmelj, Luca Benini

Recurrent neural networks (RNNs) are state-of-the-art in voice awareness/understanding and speech recognition.

Speech Recognition

Paper

CBinfer: Change-Based Inference for Convolutional Neural Networks on Video Data

1 code implementation • 14 Apr 2017 • Lukas Cavigelli, Philippe Degen, Luca Benini

Extracting per-frame features using convolutional neural networks for real-time processing of video data is currently mainly performed on powerful GPU-accelerated workstations and compute clusters.

Paper
Code

Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations

no code implementations • NeurIPS 2017 • Eirikur Agustsson, Fabian Mentzer, Michael Tschannen, Lukas Cavigelli, Radu Timofte, Luca Benini, Luc van Gool

We present a new approach to learn compressible representations in deep architectures with an end-to-end training strategy.

Image Compression Neural Network Compression +1

Paper

Neurostream: Scalable and Energy Efficient Deep Learning with Smart Memory Cubes

no code implementations • 23 Jan 2017 • Erfan Azarkhish, Davide Rossi, Igor Loi, Luca Benini

Our codesign approach consists of a network of Smart Memory Cubes (modular extensions to the standard HMC) each augmented with a many-core PIM platform called NeuroCluster.

Hardware Architecture Emerging Technologies

Paper

An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

4 code implementations • 18 Dec 2016 • Francesco Conti, Robert Schilling, Pasquale Davide Schiavone, Antonio Pullini, Davide Rossi, Frank Kagan Gürkaynak, Michael Muehlberghuber, Michael Gautschi, Igor Loi, Germain Haugou, Stefan Mangard, Luca Benini

Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline.

EEG Face Detection +1

Paper
Code

CAS-CNN: A Deep Convolutional Neural Network for Image Compression Artifact Suppression

1 code implementation • 22 Nov 2016 • Lukas Cavigelli, Pascal Hager, Luca Benini

Lossy image compression algorithms are pervasively used to reduce the size of images transmitted over the web and recorded on data storage media.

Image Compression

Paper
Code

Computationally Efficient Target Classification in Multispectral Image Data with Deep Neural Networks

no code implementations • 9 Nov 2016 • Lukas Cavigelli, Dominic Bernath, Michele Magno, Luca Benini

The required communication links and archiving of the video data are still expensive and this setup excludes preemptive actions to respond to imminent threats.

General Classification Scene Labeling

Paper

Deep Structured Features for Semantic Segmentation

no code implementations • 26 Sep 2016 • Michael Tschannen, Lukas Cavigelli, Fabian Mentzer, Thomas Wiatowski, Luca Benini

We propose a highly structured neural network architecture for semantic segmentation with an extremely small model size, suitable for low-power embedded and mobile platforms.

General Classification Segmentation +1

Paper

YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration

no code implementations • 17 Jun 2016 • Renzo Andri, Lukas Cavigelli, Davide Rossi, Luca Benini

Convolutional neural networks (CNNs) have revolutionized the world of computer vision over the last few years, pushing image classification beyond human accuracy.

General Classification Image Classification

Paper

Origami: A 803 GOp/s/W Convolutional Network Accelerator

no code implementations • 14 Dec 2015 • Lukas Cavigelli, Luca Benini

An ever-increasing number of computer vision and image/video processing challenges are being approached using deep convolutional neural networks, obtaining state-of-the-art results in object recognition and detection, semantic segmentation, action recognition, optical flow, and super-resolution.

Action Recognition Object Recognition +3

Paper
