Search Results for author: Carole-Jean Wu

Found 47 papers, 13 papers with code

Is Flash Attention Stable?

no code implementations • 5 May 2024 • Alicia Golden, Samuel Hsia, Fei Sun, Bilge Acun, Basil Hosmer, Yejin Lee, Zachary DeVito, Jeff Johnson, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

Training large-scale machine learning models poses distinct system challenges, given both the size and complexity of today's workloads.

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

no code implementations • 25 Apr 2024 • Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu

We present LayerSkip, an end-to-end solution to speed up inference of large language models (LLMs).

Continual Pretraining, Semantic Parsing

Introducing v0.5 of the AI Safety Benchmark from MLCommons

1 code implementation • 18 Apr 2024 • Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bommasani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren

We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark.

Croissant: A Metadata Format for ML-Ready Datasets

1 code implementation • 28 Mar 2024 • Mubashara Akhtar, Omar Benjelloun, Costanza Conforti, Joan Giner-Miguelez, Nitisha Jain, Michael Kuchnik, Quentin Lhoest, Pierre Marcenac, Manil Maskey, Peter Mattson, Luis Oala, Pierre Ruyssen, Rajat Shinde, Elena Simperl, Goeffry Thomas, Slava Tykhonov, Joaquin Vanschoren, Steffen Vogler, Carole-Jean Wu

Data is a critical resource for Machine Learning (ML), yet working with data remains a key friction point.

Friction Management

CHAI: Clustered Head Attention for Efficient LLM Inference

no code implementations • 12 Mar 2024 • Saurabh Agarwal, Bilge Acun, Basil Hosmer, Mostafa Elhoushi, Yejin Lee, Shivaram Venkataraman, Dimitris Papailiopoulos, Carole-Jean Wu

We observe a high degree of redundancy across heads in which tokens they attend to.

HeteroSwitch: Characterizing and Taming System-Induced Data Heterogeneity in Federated Learning

no code implementations • 7 Mar 2024 • Gyudong Kim, Mehdi Ghasemi, Soroush Heidari, Seungryong Kim, Young Geun Kim, Sarma Vrudhula, Carole-Jean Wu

Such fragmentation introduces a new type of data heterogeneity in FL, namely system-induced data heterogeneity, as each device generates distinct data depending on its hardware and software configurations.

Domain Generalization, Fairness +1

Generative AI Beyond LLMs: System Implications of Multi-Modal Generation

no code implementations • 22 Dec 2023 • Alicia Golden, Samuel Hsia, Fei Sun, Bilge Acun, Basil Hosmer, Yejin Lee, Zachary DeVito, Jeff Johnson, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

As the development of large-scale Generative AI models evolves beyond text (1D) generation to include image (2D) and video (3D) generation, processing spatial and temporal information presents unique challenges to quality, performance, and efficiency.

3D Generation

Decoding Data Quality via Synthetic Corruptions: Embedding-guided Pruning of Code Data

no code implementations • 5 Dec 2023 • Yu Yang, Aaditya K. Singh, Mostafa Elhoushi, Anas Mahmoud, Kushal Tirumala, Fabian Gloeckle, Baptiste Rozière, Carole-Jean Wu, Ari S. Morcos, Newsha Ardalani

Armed with this knowledge, we devise novel pruning metrics that operate in embedding space to identify and remove low-quality entries in the Stack dataset.

Code Generation

Data Acquisition: A New Frontier in Data-centric AI

no code implementations • 22 Nov 2023 • Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia, James Zou

As Machine Learning (ML) systems continue to grow, the demand for relevant and comprehensive datasets becomes imperative.

GEVO-ML: Optimizing Machine Learning Code with Evolutionary Computation

no code implementations • 16 Oct 2023 • Jhe-Yu Liou, Stephanie Forrest, Carole-Jean Wu

For the training workloads, GEVO-ML finds a 4.88% improvement in model accuracy, from 91% to 96%, without sacrificing training or testing speed.

MAD Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems

no code implementations • 4 Oct 2023 • Samuel Hsia, Alicia Golden, Bilge Acun, Newsha Ardalani, Zachary DeVito, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

Training and deploying large machine learning (ML) models is time-consuming and requires significant distributed computing infrastructures.

Distributed Computing

READ: Recurrent Adaptation of Large Transformers

no code implementations • 24 May 2023 • Sid Wang, John Nguyen, Ke Li, Carole-Jean Wu

However, fine-tuning all pre-trained model parameters becomes impractical as the model size and number of tasks increase.

Transfer Learning

Green Federated Learning

no code implementations • 26 Mar 2023 • Ashkan Yousefpour, Shen Guo, Ashish Shenoy, Sayan Ghosh, Pierre Stock, Kiwan Maeng, Schalk-Willem Krüger, Michael Rabbat, Carole-Jean Wu, Ilya Mironov

The rapid progress of AI is fueled by increasingly large and computationally intensive machine learning models and datasets.

Federated Learning

Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference

no code implementations • 10 Mar 2023 • Haiyang Huang, Newsha Ardalani, Anna Sun, Liu Ke, Hsien-Hsin S. Lee, Anjali Sridhar, Shruti Bhosale, Carole-Jean Wu, Benjamin Lee

We propose three optimization techniques to mitigate sources of inefficiencies, namely (1) Dynamic gating, (2) Expert Buffering, and (3) Expert load balancing.

Decoder, Language Modelling +1

MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation

no code implementations • 21 Feb 2023 • Samuel Hsia, Udit Gupta, Bilge Acun, Newsha Ardalani, Pan Zhong, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

Based on our characterization of various embedding representations, we propose a hybrid embedding representation that achieves higher quality embeddings at the cost of increased memory and compute requirements.

Recommendation Systems

FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models

no code implementations • 8 Jan 2023 • Geet Sethi, Pallab Bhattacharya, Dhruv Choudhary, Carole-Jean Wu, Christos Kozyrakis

Sequence-based deep learning recommendation models (DLRMs) are an emerging class of DLRMs showing great improvements over their prior sum-pooling based counterparts at capturing users' long term interests.

FedGPO: Heterogeneity-Aware Global Parameter Optimization for Efficient Federated Learning

no code implementations • 30 Nov 2022 • Young Geun Kim, Carole-Jean Wu

Federated learning (FL) has emerged as a solution to deal with the risk of privacy leaks in machine learning training.

Federated Learning

RecD: Deduplication for End-to-End Deep Learning Recommendation Model Training Infrastructure

no code implementations • 9 Nov 2022 • Mark Zhao, Dhruv Choudhary, Devashish Tyagi, Ajay Somani, Max Kaplan, Sung-Han Lin, Sarunya Pumma, Jongsoo Park, Aarti Basant, Niket Agarwal, Carole-Jean Wu, Christos Kozyrakis

RecD addresses immense storage, preprocessing, and training overheads caused by feature duplication inherent in industry-scale DLRM training datasets.

Understanding Scaling Laws for Recommendation Models

no code implementations • 17 Aug 2022 • Newsha Ardalani, Carole-Jean Wu, Zeliang Chen, Bhargav Bhushanam, Adnan Aziz

We show that parameter scaling is out of steam for the model architecture under study, and until a higher-performing model architecture emerges, data scaling is the path forward.

DataPerf: Benchmarks for Data-Centric AI Development

1 code implementation • NeurIPS 2023 • Mark Mazumder, Colby Banbury, Xiaozhe Yao, Bojan Karlaš, William Gaviria Rojas, Sudnya Diamos, Greg Diamos, Lynn He, Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Douwe Kiela, David Jurado, David Kanter, Rafael Mosquera, Juan Ciro, Lora Aroyo, Bilge Acun, Lingjiao Chen, Mehul Smriti Raje, Max Bartolo, Sabri Eyuboglu, Amirata Ghorbani, Emmett Goodman, Oana Inel, Tariq Kane, Christine R. Kirkpatrick, Tzu-Sheng Kuo, Jonas Mueller, Tristan Thrush, Joaquin Vanschoren, Margaret Warren, Adina Williams, Serena Yeung, Newsha Ardalani, Praveen Paritosh, Lilith Bat-Leah, Ce Zhang, James Zou, Carole-Jean Wu, Cody Coleman, Andrew Ng, Peter Mattson, Vijay Janapa Reddi

Machine learning research has long focused on models rather than datasets, and prominent datasets are used for common ML tasks without regard to the breadth, difficulty, and faithfulness of the underlying problems.

FEL: High Capacity Learning for Recommendation and Ranking via Federated Ensemble Learning

no code implementations • 7 Jun 2022 • Meisam Hejazinia, Dzmitry Huba, Ilias Leontiadis, Kiwan Maeng, Mani Malek, Luca Melis, Ilya Mironov, Milad Nasr, Kaikai Wang, Carole-Jean Wu

Despite FL's initial success, many important deep learning use cases, such as ranking and recommendation tasks, have been limited from on-device learning.

Ensemble Learning, Federated Learning +1

Infinite Recommendation Networks: A Data-Centric Approach

5 code implementations • 3 Jun 2022 • Noveen Sachdeva, Mehak Preet Dhaliwal, Carole-Jean Wu, Julian McAuley

We leverage the Neural Tangent Kernel and its equivalence to training infinitely-wide neural networks to devise ∞-AE: an autoencoder with infinitely-wide bottleneck layers.

Ranked #1 on Recommendation Systems on Douban (AUC metric)

Information Retrieval, Recommendation Systems

Towards Fair Federated Recommendation Learning: Characterizing the Inter-Dependence of System and Data Heterogeneity

no code implementations • 30 May 2022 • Kiwan Maeng, Haiyu Lu, Luca Melis, John Nguyen, Mike Rabbat, Carole-Jean Wu

Federated learning (FL) is an effective mechanism for data privacy in recommender systems by running machine learning model training on-device.

Fairness, Federated Learning +2

RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation

no code implementations • 25 Jan 2022 • Geet Sethi, Bilge Acun, Niket Agarwal, Christos Kozyrakis, Caroline Trippel, Carole-Jean Wu

EMBs exhibit distinct memory characteristics, providing performance optimization opportunities for intelligent EMB partitioning and placement across a tiered memory hierarchy.

On Sampling Collaborative Filtering Datasets

1 code implementation • 13 Jan 2022 • Noveen Sachdeva, Carole-Jean Wu, Julian McAuley

We study the practical consequences of dataset sampling strategies on the ranking performance of recommendation algorithms.

Collaborative Filtering, Recommendation Systems

Papaya: Practical, Private, and Scalable Federated Learning

no code implementations • 8 Nov 2021 • Dzmitry Huba, John Nguyen, Kshitiz Malik, Ruiyu Zhu, Mike Rabbat, Ashkan Yousefpour, Carole-Jean Wu, Hongyuan Zhan, Pavel Ustinov, Harish Srinivas, Kaikai Wang, Anthony Shoumikhin, Jesik Min, Mani Malek

Our work tackles the aforementioned issues, sketches some of the system design challenges and their solutions, and touches upon principles that emerged from building a production FL system for millions of clients.

Federated Learning

Sustainable AI: Environmental Implications, Challenges and Opportunities

no code implementations • 30 Oct 2021 • Carole-Jean Wu, Ramya Raghavendra, Udit Gupta, Bilge Acun, Newsha Ardalani, Kiwan Maeng, Gloria Chang, Fiona Aga Behram, James Huang, Charles Bai, Michael Gschwind, Anurag Gupta, Myle Ott, Anastasia Melnikov, Salvatore Candido, David Brooks, Geeta Chauhan, Benjamin Lee, Hsien-Hsin S. Lee, Bugra Akyildiz, Maximilian Balandat, Joe Spisak, Ravi Jain, Mike Rabbat, Kim Hazelwood

This paper explores the environmental impact of the super-linear growth trends for AI from a holistic perspective, spanning Data, Algorithms, and System Hardware.

Understanding Data Storage and Ingestion for Large-Scale Deep Recommendation Model Training

no code implementations • 20 Aug 2021 • Mark Zhao, Niket Agarwal, Aarti Basant, Bugra Gedik, Satadru Pan, Mustafa Ozdal, Rakesh Komuravelli, Jerry Pan, Tianshu Bao, Haowei Lu, Sundaram Narayanan, Jack Langman, Kevin Wilfong, Harsha Rastogi, Carole-Jean Wu, Christos Kozyrakis, Parik Pol

Innovations that improve the efficiency and performance of DSI systems and hardware are urgent, demanding a deep understanding of DSI characteristics and infrastructure at scale.

Scheduling

AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning

no code implementations • 16 Jul 2021 • Young Geun Kim, Carole-Jean Wu

Federated learning enables a cluster of decentralized mobile devices at the edge to collaboratively train a shared machine learning model, while keeping all the raw training samples on device.

Federated Learning

SVP-CF: Selection via Proxy for Collaborative Filtering Data

no code implementations • 11 Jul 2021 • Noveen Sachdeva, Carole-Jean Wu, Julian McAuley

As we demonstrate, commonly-used data sampling schemes can have significant consequences on algorithm performance -- masking performance deficiencies in algorithms or altering the relative performance of algorithms, as compared to models trained on the complete dataset.

Collaborative Filtering, Recommendation Systems

Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale

no code implementations • 26 May 2021 • Zhaoxia Deng, Jongsoo Park, Ping Tak Peter Tang, Haixin Liu, Jie Yang, Hector Yuen, Jianyu Huang, Daya Khudia, Xiaohan Wei, Ellie Wen, Dhruv Choudhary, Raghuraman Krishnamoorthi, Carole-Jean Wu, Satish Nadathur, Changkyu Kim, Maxim Naumov, Sam Naghshineh, Mikhail Smelyanskiy

In this paper, we share our search strategies for adapting reference recommendation models to low-precision hardware, our optimization of low-precision compute kernels, and the design and development of a tool chain that maintains our models' accuracy throughout their lifespan, during which topic trends and users' interests inevitably evolve.

Recommendation Systems

RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

1 code implementation • 18 May 2021 • Udit Gupta, Samuel Hsia, Jeff Zhang, Mark Wilkening, Javin Pombra, Hsien-Hsin S. Lee, Gu-Yeon Wei, Carole-Jean Wu, David Brooks

Thus, we design RecPipeAccel (RPAccel), a custom accelerator that jointly optimizes quality, tail-latency, and system throughput.

Recommendation Systems, Scheduling

RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference

no code implementations • 29 Jan 2021 • Mark Wilkening, Udit Gupta, Samuel Hsia, Caroline Trippel, Carole-Jean Wu, David Brooks, Gu-Yeon Wei

Neural personalized recommendation models are used across a wide variety of datacenter applications including search, social media, and entertainment.

TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models

1 code implementation • 25 Jan 2021 • Chunxing Yin, Bilge Acun, Xing Liu, Carole-Jean Wu

TT-Rec achieves 117 times and 112 times model size compression, for Kaggle and Terabyte, respectively.

Understanding Training Efficiency of Deep Learning Recommendation Models at Scale

no code implementations • 11 Nov 2020 • Bilge Acun, Matthew Murphy, Xiaodong Wang, Jade Nie, Carole-Jean Wu, Kim Hazelwood

The use of GPUs has proliferated for machine learning workflows and is now considered mainstream for many deep learning models.

CPR: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery

no code implementations • 5 Nov 2020 • Kiwan Maeng, Shivam Bharuka, Isabel Gao, Mark C. Jeffrey, Vikram Saraph, Bor-Yiing Su, Caroline Trippel, Jiyan Yang, Mike Rabbat, Brandon Lucia, Carole-Jean Wu

To the best of our knowledge, this paper is the first to perform a data-driven, in-depth analysis of applying partial recovery to recommendation models, and it identifies a trade-off between accuracy and performance.

Understanding Capacity-Driven Scale-Out Neural Recommendation Inference

no code implementations • 4 Nov 2020 • Michael Lui, Yavuz Yetim, Özgür Özkan, Zhuoran Zhao, Shin-Yeh Tsai, Carole-Jean Wu, Mark Hempstead

One approach to support this scale is with distributed serving, or distributed inference, which divides the memory requirements of a single large model across multiple servers.

Recommendation Systems

Cross-Stack Workload Characterization of Deep Recommendation Systems

no code implementations • 10 Oct 2020 • Samuel Hsia, Udit Gupta, Mark Wilkening, Carole-Jean Wu, Gu-Yeon Wei, David Brooks

Deep learning based recommendation systems form the backbone of most personalized cloud services.

Recommendation Systems

AutoScale: Optimizing Energy Efficiency of End-to-End Edge Inference under Stochastic Variance

no code implementations • 6 May 2020 • Young Geun Kim, Carole-Jean Wu

Such execution scaling decisions become more complicated with the stochastic nature of mobile-cloud execution, where signal strength variations of the wireless networks and resource interference can significantly affect real-time inference performance and system energy efficiency.

GEVO: GPU Code Optimization using Evolutionary Computation

1 code implementation • 17 Apr 2020 • Jhe-Yu Liou, Xiaodong Wang, Stephanie Forrest, Carole-Jean Wu

If kernel output accuracy is relaxed to tolerate up to 1% error, GEVO can find kernel variants that outperform the baseline version by an average of 51.08%.

BIG-bench Machine Learning, Handwriting Recognition +1

Developing a Recommendation Benchmark for MLPerf Training and Inference

no code implementations • 16 Mar 2020 • Carole-Jean Wu, Robin Burke, Ed H. Chi, Joseph Konstan, Julian McAuley, Yves Raimond, Hao Zhang

Deep learning-based recommendation models are used pervasively and broadly, for example, to recommend movies, products, or other information most relevant to users, in order to enhance the user experience.

Image Classification, object-detection +3

DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference

no code implementations • 8 Jan 2020 • Udit Gupta, Samuel Hsia, Vikram Saraph, Xiaodong Wang, Brandon Reagen, Gu-Yeon Wei, Hsien-Hsin S. Lee, David Brooks, Carole-Jean Wu

Neural personalized recommendation is the cornerstone of a wide collection of cloud services and products, constituting significant compute demand of the cloud infrastructure.

Distributed, Parallel, and Cluster Computing

MLPerf Inference Benchmark

4 code implementations • 6 Nov 2019 • Vijay Janapa Reddi, Christine Cheng, David Kanter, Peter Mattson, Guenther Schmuelling, Carole-Jean Wu, Brian Anderson, Maximilien Breughe, Mark Charlebois, William Chou, Ramesh Chukka, Cody Coleman, Sam Davis, Pan Deng, Greg Diamos, Jared Duke, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Idgunji, Thomas B. Jablin, Jeff Jiao, Tom St. John, Pankaj Kanwar, David Lee, Jeffery Liao, Anton Lokhmotov, Francisco Massa, Peng Meng, Paulius Micikevicius, Colin Osborne, Gennady Pekhimenko, Arun Tejusve Raghunath Rajan, Dilip Sequeira, Ashish Sirasao, Fei Sun, Hanlin Tang, Michael Thomson, Frank Wei, Ephrem Wu, Lingjie Xu, Koichi Yamada, Bing Yu, George Yuan, Aaron Zhong, Peizhao Zhang, Yuchen Zhou

Machine-learning (ML) hardware and software system demand is burgeoning.

Benchmarking

MLPerf Training Benchmark

2 code implementations • 2 Oct 2019 • Peter Mattson, Christine Cheng, Cody Coleman, Greg Diamos, Paulius Micikevicius, David Patterson, Hanlin Tang, Gu-Yeon Wei, Peter Bailis, Victor Bittorf, David Brooks, Dehao Chen, Debojyoti Dutta, Udit Gupta, Kim Hazelwood, Andrew Hock, Xinyuan Huang, Atsushi Ike, Bill Jia, Daniel Kang, David Kanter, Naveen Kumar, Jeffery Liao, Guokai Ma, Deepak Narayanan, Tayo Oguntebi, Gennady Pekhimenko, Lillian Pentecost, Vijay Janapa Reddi, Taylor Robie, Tom St. John, Tsuguchika Tabaru, Carole-Jean Wu, Lingjie Xu, Masafumi Yamazaki, Cliff Young, Matei Zaharia

Machine learning (ML) needs industry-standard performance benchmarks to support design and competitive evaluation of the many emerging software and hardware solutions for ML.

Benchmarking, BIG-bench Machine Learning

Exploiting Parallelism Opportunities with Deep Learning Frameworks

1 code implementation • 13 Aug 2019 • Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim Hazelwood, David Brooks

State-of-the-art machine learning frameworks support a wide variety of design features to enable a flexible machine learning programming interface and to ease the programmability burden on machine learning developers.

BIG-bench Machine Learning

The Architectural Implications of Facebook's DNN-based Personalized Recommendation

7 code implementations • 6 Jun 2019 • Udit Gupta, Carole-Jean Wu, Xiaodong Wang, Maxim Naumov, Brandon Reagen, David Brooks, Bradford Cottel, Kim Hazelwood, Bill Jia, Hsien-Hsin S. Lee, Andrey Malevich, Dheevatsa Mudigere, Mikhail Smelyanskiy, Liang Xiong, Xuan Zhang

The widespread application of deep learning has changed the landscape of computation in the data center.

Deep Learning Recommendation Model for Personalization and Recommendation Systems

18 code implementations • 31 May 2019 • Maxim Naumov, Dheevatsa Mudigere, Hao-Jun Michael Shi, Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole-Jean Wu, Alisson G. Azzolini, Dmytro Dzhulgakov, Andrey Malevich, Ilia Cherniavskii, Yinghai Lu, Raghuraman Krishnamoorthi, Ansha Yu, Volodymyr Kondratenko, Stephanie Pereira, Xianjie Chen, Wenlin Chen, Vijay Rao, Bill Jia, Liang Xiong, Misha Smelyanskiy

With the advent of deep learning, neural network-based recommendation models have emerged as an important tool for tackling personalization and recommendation tasks.

Recommendation Systems
