Search Results for author: James Laudon

Found 12 papers, 3 papers with code

GiPH: Generalizable Placement Learning for Adaptive Heterogeneous Computing

1 code implementation • 23 May 2023 • Yi Hu, Chaoran Zhang, Edward Andert, Harshul Singh, Aviral Shrivastava, James Laudon, Yanqi Zhou, Bob Iannucci, Carlee Joe-Wong

Careful placement of a computational application within a target device cluster is critical for achieving low application completion time.

Edge-computing

Lifelong Language Pretraining with Distribution-Specialized Experts

no code implementations • 20 May 2023 • Wuyang Chen, Yanqi Zhou, Nan Du, Yanping Huang, James Laudon, Zhifeng Chen, Claire Cu

Compared to existing lifelong learning approaches, Lifelong-MoE achieves better few-shot performance on 19 downstream NLP tasks.

Mixture-of-Experts with Expert Choice Routing

no code implementations • 18 Feb 2022 • Yanqi Zhou, Tao Lei, Hanxiao Liu, Nan Du, Yanping Huang, Vincent Zhao, Andrew Dai, Zhifeng Chen, Quoc Le, James Laudon

Prior work allocates a fixed number of experts to each token using a top-k function regardless of the relative importance of different tokens.

A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules

no code implementations • 7 Dec 2021 • Xinfeng Xie, Prakash Prabhu, Ulysse Beaugnon, Phitchaya Mangpo Phothilimthana, Sudip Roy, Azalia Mirhoseini, Eugene Brevdo, James Laudon, Yanqi Zhou

Partitioning ML graphs for MCMs is particularly hard as the search space grows exponentially with the number of chiplets available and the number of nodes in the neural network.

BIG-bench Machine Learning, Reinforcement Learning (RL)

An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks

1 code implementation • 20 Feb 2021 • Kiran Seshadri, Berkin Akin, James Laudon, Ravi Narayanaswami, Amir Yazdanbakhsh

Then, we extensively evaluate three classes of Edge TPUs, covering different computing ecosystems, that are either currently deployed in Google products or are in the product pipeline, across 423K unique convolutional neural networks.

Apollo: Transferable Architecture Exploration

no code implementations • 2 Feb 2021 • Amir Yazdanbakhsh, Christof Angermueller, Berkin Akin, Yanqi Zhou, Albin Jones, Milad Hashemi, Kevin Swersky, Satrajit Chatterjee, Ravi Narayanaswami, James Laudon

We further show that by transferring knowledge between target architectures with different design constraints, Apollo is able to find optimal configurations faster and often with better objective value (up to 25% improvements).

Transferable Graph Optimizers for ML Compilers

no code implementations • NeurIPS 2020 • Yanqi Zhou, Sudip Roy, Amirali Abdolrashidi, Daniel Wong, Peter Ma, Qiumin Xu, Hanxiao Liu, Phitchaya Mangpo Phothilimthana, Shen Wang, Anna Goldie, Azalia Mirhoseini, James Laudon

Most compilers for machine learning (ML) frameworks need to solve many correlated optimization problems to generate efficient machine code.

GDP: Generalized Device Placement for Dataflow Graphs

no code implementations • 28 Sep 2019 • Yanqi Zhou, Sudip Roy, Amirali Abdolrashidi, Daniel Wong, Peter C. Ma, Qiumin Xu, Ming Zhong, Hanxiao Liu, Anna Goldie, Azalia Mirhoseini, James Laudon

Runtime and scalability of large neural networks can be significantly affected by the placement of operations in their dataflow graphs on suitable devices.
