no code implementations • 7 Mar 2024 • Febin Sunny, Ebadollah Taheri, Mahdi Nikdast, Sudeep Pasricha
Modern machine learning (ML) applications are becoming increasingly complex, and monolithic (single-chip) accelerator architectures cannot keep up with their energy-efficiency and throughput demands.
no code implementations • 28 Jan 2023 • Febin Sunny, Ebadollah Taheri, Mahdi Nikdast, Sudeep Pasricha
Domain-specific machine learning (ML) accelerators such as Google's TPU and Apple's Neural Engine now dominate CPUs and GPUs for energy-efficient ML processing.
no code implementations • 16 Feb 2021 • Ebadollah Taheri, Ryan G. Kim, Mahdi Nikdast
By lowering the number of vertical connections in fully connected 3D networks-on-chip (NoCs), partially connected 3D NoCs (PC-3DNoCs) help alleviate reliability and fabrication issues.
Distributed, Parallel, and Cluster Computing • Hardware Architecture • Performance