Search Results for author: James Laudon

Found 12 papers, 3 papers with code

GiPH: Generalizable Placement Learning for Adaptive Heterogeneous Computing

1 code implementation • 23 May 2023 • Yi Hu, Chaoran Zhang, Edward Andert, Harshul Singh, Aviral Shrivastava, James Laudon, Yanqi Zhou, Bob Iannucci, Carlee Joe-Wong

Careful placement of a computational application within a target device cluster is critical for achieving low application completion time.

Edge-computing

Paper
Code

Lifelong Language Pretraining with Distribution-Specialized Experts

no code implementations • 20 May 2023 • Wuyang Chen, Yanqi Zhou, Nan Du, Yanping Huang, James Laudon, Zhifeng Chen, Claire Cu

Compared to existing lifelong learning approaches, Lifelong-MoE achieves better few-shot performance on 19 downstream NLP tasks.

Paper
Add Code

Mixture-of-Experts with Expert Choice Routing

no code implementations • 18 Feb 2022 • Yanqi Zhou, Tao Lei, Hanxiao Liu, Nan Du, Yanping Huang, Vincent Zhao, Andrew Dai, Zhifeng Chen, Quoc Le, James Laudon

Prior work allocates a fixed number of experts to each token using a top-k function regardless of the relative importance of different tokens.

Paper
Add Code

A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules

no code implementations • 7 Dec 2021 • Xinfeng Xie, Prakash Prabhu, Ulysse Beaugnon, Phitchaya Mangpo Phothilimthana, Sudip Roy, Azalia Mirhoseini, Eugene Brevdo, James Laudon, Yanqi Zhou

Partitioning ML graphs for MCMs is particularly hard as the search space grows exponentially with the number of chiplets available and the number of nodes in the neural network.

BIG-bench Machine Learning Reinforcement Learning (RL) +1

Paper
Add Code

An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks

1 code implementation • 20 Feb 2021 • Kiran Seshadri, Berkin Akin, James Laudon, Ravi Narayanaswami, Amir Yazdanbakhsh

Then, we extensively evaluate three classes of Edge TPUs, covering different computing ecosystems, that are either currently deployed in Google products or are the product pipeline, across 423K unique convolutional neural networks.

32,745

Paper
Code

Rethinking Co-design of Neural Architectures and Hardware Accelerators

no code implementations • 17 Feb 2021 • Yanqi Zhou, Xuanyi Dong, Berkin Akin, Mingxing Tan, Daiyi Peng, Tianjian Meng, Amir Yazdanbakhsh, Da Huang, Ravi Narayanaswami, James Laudon

In our work, we target the optimization of hardware and software configurations on an industry-standard edge accelerator.

Neural Architecture Search

Paper
Add Code

Apollo: Transferable Architecture Exploration

no code implementations • 2 Feb 2021 • Amir Yazdanbakhsh, Christof Angermueller, Berkin Akin, Yanqi Zhou, Albin Jones, Milad Hashemi, Kevin Swersky, Satrajit Chatterjee, Ravi Narayanaswami, James Laudon

We further show that by transferring knowledge between target architectures with different design constraints, Apollo is able to find optimal configurations faster and often with better objective value (up to 25% improvements).

Paper
Add Code

NAHAS: Neural Architecture and Hardware Accelerator Search

no code implementations • 1 Jan 2021 • Yanqi Zhou, Xuanyi Dong, Daiyi Peng, Ethan Zhu, Amir Yazdanbakhsh, Berkin Akin, Mingxing Tan, James Laudon

In this paper, we study the importance of co-designing neural architectures and hardware accelerators.

Neural Architecture Search

Paper
Add Code

Transferable Graph Optimizers for ML Compilers

no code implementations • NeurIPS 2020 • Yanqi Zhou, Sudip Roy, Amirali Abdolrashidi, Daniel Wong, Peter Ma, Qiumin Xu, Hanxiao Liu, Phitchaya Mangpo Phothilimthana, Shen Wang, Anna Goldie, Azalia Mirhoseini, James Laudon

Most compilers for machine learning (ML) frameworks need to solve many correlated optimization problems to generate efficient machine code.

Paper
Add Code

Chip Placement with Deep Reinforcement Learning

2 code implementations • 22 Apr 2020 • Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Jiang, Ebrahim Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Sungmin Bae, Azade Nazi, Jiwoo Pak, Andy Tong, Kavya Srinivasa, William Hang, Emre Tuncer, Anand Babu, Quoc V. Le, James Laudon, Richard Ho, Roger Carpenter, Jeff Dean

To achieve these results, we pose placement as a Reinforcement Learning (RL) problem and train an agent to place the nodes of a chip netlist onto a chip canvas.

reinforcement-learning Reinforcement Learning (RL) +2

Paper
Code

GDP: Generalized Device Placement for Dataflow Graphs

no code implementations • 28 Sep 2019 • Yanqi Zhou, Sudip Roy, Amirali Abdolrashidi, Daniel Wong, Peter C. Ma, Qiumin Xu, Ming Zhong, Hanxiao Liu, Anna Goldie, Azalia Mirhoseini, James Laudon

Runtime and scalability of large neural networks can be significantly affected by the placement of operations in their dataflow graphs on suitable devices.

Paper
Add Code

In-Datacenter Performance Analysis of a Tensor Processing Unit

no code implementations • 16 Apr 2017 • Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, John Hu, Robert Hundt, Dan Hurt, Julian Ibarz, Aaron Jaffey, Alek Jaworski, Alexander Kaplan, Harshit Khaitan, Andy Koch, Naveen Kumar, Steve Lacy, James Laudon, James Law, Diemthu Le, Chris Leary, Zhuyuan Liu, Kyle Lucke, Alan Lundin, Gordon MacKean, Adriana Maggiore, Maire Mahony, Kieran Miller, Rahul Nagarajan, Ravi Narayanaswami, Ray Ni, Kathy Nix, Thomas Norrie, Mark Omernick, Narayana Penukonda, Andy Phelps, Jonathan Ross, Matt Ross, Amir Salek, Emad Samadiani, Chris Severn, Gregory Sizikov, Matthew Snelham, Jed Souter, Dan Steinberg, Andy Swing, Mercedes Tan, Gregory Thorson, Bo Tian, Horia Toma, Erick Tuttle, Vijay Vasudevan, Richard Walter, Walter Wang, Eric Wilcox, Doe Hyun Yoon

Our workload, written in the high-level TensorFlow framework, uses production NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters' NN inference demand.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.