The Ultimate DataFlow for Ultimate SuperComputers-on-a-Chip, for Scientific Computing, Geo Physics, Complex Mathematics, and Information Processing

20 Sep 2020  ·  Veljko Milutinovic, Erfan Sadeqi Azer, Kristy Yoshimoto, Gerhard Klimeck, Miljan Djordjevic, Milos Kotlar, Miroslav Bojovic, Bozidar Miladinovic, Nenad Korolija, Stevan Stankovic, Nenad Filipović, Zoran Babovic, Miroslav Kosanic, Akira Tsuda, Mateo Valero, Massimo De Santo, Erich Neuhold, Jelena Skoručak, Laura Dipietro, Ivan Ratkovic ·

This article starts from the assumption that near future 100BTransistor SuperComputers-on-a-Chip will include N big multi-core processors, 1000N small many-core processors, a TPU-like fixed-structure systolic array accelerator for the most frequently used Machine Learning algorithms needed in bandwidth-bound applications and a flexible-structure reprogrammable accelerator for less frequently used Machine Learning algorithms needed in latency-critical applications.

PDF Abstract
No code implementations yet. Submit your code now

Categories


Distributed, Parallel, and Cluster Computing

Datasets


  Add Datasets introduced or used in this paper