1 code implementation • 19 Jun 2023 • Wenqi Jiang, Shigang Li, Yu Zhu, Johannes De Fine Licht, Zhenhao He, Runbin Shi, Cedric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, Gustavo Alonso
Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by evaluating vector similarities between encoded query texts and web documents.
2 code implementations • 10 Oct 2019 • Johannes de Fine Licht, Torsten Hoefler
High-level synthesis (HLS) tools have brought FPGA development into the mainstream by allowing programmers to design architectures using familiar languages such as C, C++, and OpenCL.
Hardware Architecture • Distributed, Parallel, and Cluster Computing • Software Engineering
1 code implementation • 18 Jul 2019 • Tiziano De Matteis, Johannes De Fine Licht, Torsten Hoefler
Spatial computing architectures offer an attractive alternative for mitigating the control and data movement overheads typical of load-store architectures.
Distributed, Parallel, and Cluster Computing
3 code implementations • 27 Feb 2019 • Tal Ben-Nun, Johannes De Fine Licht, Alexandros Nikolaos Ziogas, Timo Schneider, Torsten Hoefler
With the ubiquity of accelerators such as FPGAs and GPUs, the complexity of high-performance programming is increasing beyond the skill set of the average scientist in domains outside of computer science.
Programming Languages • Distributed, Parallel, and Cluster Computing • Performance
no code implementations • 25 Feb 2019 • Maciej Besta, Dimitri Stanojevic, Johannes De Fine Licht, Tal Ben-Nun, Torsten Hoefler
To facilitate understanding of this emerging domain, we present the first survey and taxonomy of graph computations on FPGAs.
Distributed, Parallel, and Cluster Computing • Hardware Architecture
2 code implementations • 21 May 2018 • Johannes de Fine Licht, Simon Meierhans, Torsten Hoefler
Specialized hardware architectures promise a major step in performance and energy efficiency over the traditional load/store devices currently employed in large-scale computing systems.
Distributed, Parallel, and Cluster Computing • Programming Languages • I.1.3; C.1.4; D.1.3