no code implementations • 28 Aug 2022 • Martin Molan, Andrea Borghesi, Daniele Cesarini, Luca Benini, Andrea Bartolini
However, current state-of-the-art (SoA) approaches to anomaly detection are supervised or semi-supervised, so they require a human-labelled dataset with anomalies, which is often impractical to collect in production HPC systems.
no code implementations • 20 May 2022 • Stefano Teso, Laurens Bliek, Andrea Borghesi, Michele Lombardi, Neil Yorke-Smith, Tias Guns, Andrea Passerini
The challenge is to learn them from available data, while taking into account a set of hard constraints that a solution must satisfy, and that solving the optimisation problem (esp.
no code implementations • 3 Mar 2021 • Federico Baldo, Lorenzo Dall'Olio, Mattia Ceccarelli, Riccardo Scheda, Michele Lombardi, Andrea Borghesi, Stefano Diciotti, Michela Milano
The advent of the coronavirus pandemic has sparked interest in predictive models capable of forecasting the spread of the virus, especially for boosting and supporting decision-making processes.
no code implementations • 27 Jul 2020 • Alessio Netti, Zeynep Kiziltan, Ozalp Babaoglu, Alina Sirbu, Andrea Bartolini, Andrea Borghesi
We introduce a high-level, easy-to-use fault injection tool called FINJ, with a focus on the management of complex experiments.
2 code implementations • 8 Jun 2020 • Alberto Signoroni, Mattia Savardi, Sergio Benini, Nicola Adami, Riccardo Leonardi, Paolo Gibellini, Filippo Vaccher, Marco Ravanelli, Andrea Borghesi, Roberto Maroldi, Davide Farina
In this work we design an end-to-end deep learning architecture for predicting, on chest X-ray (CXR) images, a multi-regional score conveying the degree of lung compromise in COVID-19 patients.
no code implementations • 20 May 2020 • Michele Lombardi, Federico Baldo, Andrea Borghesi, Michela Milano
Regularization-based approaches for injecting constraints in Machine Learning (ML) were introduced to improve a predictive model via expert knowledge.
no code implementations • 19 May 2020 • Andrea Borghesi, Federico Baldo, Michela Milano
Deep Learning (DL) models have proved to perform extremely well on a wide variety of learning tasks, as they can learn useful patterns from large data sets.
1 code implementation • 24 Feb 2020 • Andrea Borghesi, Federico Baldo, Michele Lombardi, Michela Milano
Machine Learning (ML) models are very effective in many learning tasks, due to their capability to extract meaningful information from large data sets.
2 code implementations • 24 Feb 2020 • Andrea Borghesi, Giuseppe Tagliavini, Michele Lombardi, Luca Benini, Michela Milano
The ML model learns the relation between the precision of the variables and the output error; this information is then embedded in the MP, which focuses on minimizing the number of bits.
Distributed, Parallel, and Cluster Computing
1 code implementation • 22 Feb 2019 • Andrea Borghesi, Antonio Libri, Luca Benini, Andrea Bartolini
Reliability is a cumbersome problem in the evolution of High Performance Computing systems and data centers.
Distributed, Parallel, and Cluster Computing
5 code implementations • 13 Nov 2018 • Andrea Borghesi, Andrea Bartolini, Michele Lombardi, Michela Milano, Luca Benini
Anomaly detection in supercomputers is a very difficult problem due to the large scale of the systems and the large number of components.
no code implementations • 26 Oct 2018 • Alessio Netti, Zeynep Kiziltan, Ozalp Babaoglu, Alina Sirbu, Andrea Bartolini, Andrea Borghesi
As High-Performance Computing (HPC) systems strive towards the exascale goal, studies suggest that they will experience excessive failure rates.
Distributed, Parallel, and Cluster Computing
1 code implementation • 26 Jul 2018 • Alessio Netti, Zeynep Kiziltan, Ozalp Babaoglu, Alina Sirbu, Andrea Bartolini, Andrea Borghesi
We present FINJ, a high-level fault injection tool for High-Performance Computing (HPC) systems, with a focus on the management of complex experiments.
Distributed, Parallel, and Cluster Computing