no code implementations • 5 Oct 2021 • Jinhyun So, Ramy E. Ali, Başak Güler, A. Salman Avestimehr
A buffered asynchronous training protocol known as FedBuff has recently been proposed, which bridges the gap between synchronous and asynchronous training to mitigate stragglers while simultaneously ensuring privacy.
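A minimal sketch of the buffered-aggregation idea, with hypothetical names (`Client`, `fedbuff_server`) and a least-squares local objective standing in for the real workload; staleness weighting and the privacy mechanism are omitted, so this is an illustration of the buffering pattern, not the protocol itself:

```python
import random
import numpy as np

class Client:
    """Holds a local least-squares dataset and returns a model delta."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def compute_update(self, model, lr=0.1):
        grad = self.x.T @ (self.x @ model - self.y) / len(self.y)
        return -lr * grad

def fedbuff_server(clients, model, buffer_size=4, steps=200):
    buffer = []
    for _ in range(steps):
        # In a real deployment updates arrive asynchronously; polling a
        # random client stands in for that here.
        buffer.append(random.choice(clients).compute_update(model))
        if len(buffer) == buffer_size:
            model = model + np.mean(buffer, axis=0)  # one buffered global step
            buffer.clear()
    return model
```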
no code implementations • 20 Sep 2021 • Mahdi Soleymani, Ramy E. Ali, Hessam Mahdavifar, A. Salman Avestimehr
While this learning-based approach is more resource-efficient than replication, it is tailored to the specific model hosted by the cloud and is particularly suitable for a small number of queries (typically fewer than four) and for tolerating very few stragglers (mostly one).
no code implementations • 16 Sep 2021 • Ahmed Roushdy Elkordy, Saurav Prakash, A. Salman Avestimehr
As our main contribution, we propose Basil, a fast and computationally efficient Byzantine-robust algorithm for decentralized training systems, which leverages a novel sequential, memory-assisted, and performance-based criterion for training over a logical ring while filtering out the Byzantine users.
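A toy sketch of the sequential, memory-assisted, performance-based rule, assuming a squared-error local objective; the exact scoring and update rules in Basil differ, so treat this only as the shape of one node's step on the ring:

```python
import numpy as np

def ring_node_step(stored_models, local_x, local_y, lr=0.1):
    """A node keeps the last few models seen on the logical ring, scores
    each on its own data, updates the best one, and forwards the result.
    Models poisoned by Byzantine predecessors score badly and are skipped."""
    def loss(w):
        return np.mean((local_x @ w - local_y) ** 2)   # performance-based criterion
    best = min(stored_models, key=loss)                # memory-assisted filtering
    grad = local_x.T @ (local_x @ best - local_y) / len(local_y)
    return best - lr * grad                            # model passed to the next node
```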
no code implementations • 27 Jan 2021 • Mahdi Soleymani, Ramy E. Ali, Hessam Mahdavifar, A. Salman Avestimehr
We further propose folded Lagrange coded computing (FLCC) to incorporate the developed techniques into a specific coded computing setting.
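For context, a sketch of the Lagrange encoding that this family of coded computing schemes builds on: a polynomial u is interpolated through the data blocks at points beta_j, and worker i receives the evaluation u(alpha_i). A worker applying a degree-d function f to its share then evaluates the composition f(u(x)), which the master can interpolate from enough workers. Reals are used for readability; LCC and FLCC operate over a finite field:

```python
import numpy as np

def lagrange_encode(data_blocks, alphas, betas):
    """Return one coded share u(alpha_i) per worker, where u is the
    Lagrange polynomial with u(beta_j) = data_blocks[j]."""
    def lagrange_coeff(z, j):
        c = 1.0
        for k, b in enumerate(betas):
            if k != j:
                c *= (z - b) / (betas[j] - b)
        return c
    return [sum(lagrange_coeff(a, j) * X for j, X in enumerate(data_blocks))
            for a in alphas]
```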
no code implementations • 11 Nov 2020 • Ramy E. Ali, Jinhyun So, A. Salman Avestimehr
In this work, we empirically show that the square function is not the best degree-2 polynomial that can replace the ReLU function even when restricting the polynomials to have integer coefficients.
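A quick numerical illustration of the claim, using a least-squares fit on [-1, 1]; the interval and the fitting criterion are assumptions for illustration, not the paper's setup:

```python
import numpy as np

x = np.linspace(-1, 1, 1001)
relu = np.maximum(x, 0)

# The commonly used square activation vs. the best-fitting quadratic.
square_err = np.mean((x**2 - relu) ** 2)
coeffs = np.polyfit(x, relu, deg=2)        # [a, b, c] for a*x^2 + b*x + c
fit_err = np.mean((np.polyval(coeffs, x) - relu) ** 2)

print(f"square:   MSE = {square_err:.4f}")
print(f"best fit: MSE = {fit_err:.4f}  coeffs = {np.round(coeffs, 3)}")
```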
no code implementations • NeurIPS 2020 • Jinhyun So, Basak Guler, A. Salman Avestimehr
We consider a collaborative learning scenario in which multiple data-owners wish to jointly train a logistic regression model, while keeping their individual datasets private from the other parties.
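A plain sketch of the underlying learning task, where each data-owner contributes only its local gradient; the paper's actual contribution, the privacy machinery wrapped around this loop, is omitted here:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def collaborative_logreg(parties, dim, lr=0.5, iters=100):
    """parties is a list of (x, y) local datasets; raw data never leaves
    its owner, only per-party gradients are aggregated."""
    w = np.zeros(dim)
    for _ in range(iters):
        grads = [x.T @ (sigmoid(x @ w) - y) / len(y) for x, y in parties]
        w -= lr * np.mean(grads, axis=0)
    return w
```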
no code implementations • 30 Sep 2020 • Ahmed Roushdy Elkordy, A. Salman Avestimehr
The state-of-the-art protocols for secure model aggregation, which are based on additive masking, require all users to quantize their model updates to the same quantization level.
Information Theory • Systems and Control
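A SecAgg-style sketch of why a common quantization grid matters: pairwise masks cancel in the sum only when all users do modular arithmetic on the same grid. The field size, level count, and function names are illustrative assumptions:

```python
import numpy as np

FIELD = 2**15            # masked arithmetic is taken modulo this ring size
LEVELS = 1024            # every user must quantize to the same level count

def quantize(update, levels=LEVELS, clip=1.0):
    """Uniform quantization of a real-valued update onto a shared grid."""
    q = np.round((np.clip(update, -clip, clip) + clip) / (2 * clip) * (levels - 1))
    return q.astype(np.int64)

def masked_update(q_update, pairwise_seeds, my_id):
    """User i adds +PRG(seed_ij) for j > i and -PRG(seed_ij) for j < i,
    so the masks cancel when the server sums all users' messages."""
    masked = q_update.copy()
    for other_id, seed in pairwise_seeds.items():
        rng = np.random.default_rng(seed)
        mask = rng.integers(0, FIELD, size=q_update.shape)
        masked += mask if other_id > my_id else -mask
    return masked % FIELD
```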
no code implementations • 19 Aug 2020 • Mahdi Soleymani, Hessam Mahdavifar, A. Salman Avestimehr
The accuracy of the outcome is also characterized in a practical setting where operations are performed using floating-point numbers.
no code implementations • 21 Jul 2020 • Jinhyun So, Basak Guler, A. Salman Avestimehr
This presents a major challenge for the resilience of the model against adversarial (Byzantine) users, who can manipulate the global model by modifying their local models or datasets.
no code implementations • 17 Jul 2020 • Mahdi Soleymani, Hessam Mahdavifar, A. Salman Avestimehr
Numerical results are then presented for experiments on the MNIST dataset.
no code implementations • 7 Jul 2020 • Saurav Prakash, Sagar Dhakal, Mustafa Akdeniz, A. Salman Avestimehr, Nageen Himayat
Federated Learning (FL) is an exciting new paradigm that enables training a global model from data generated locally at the client nodes, without moving client data to a centralized server.
2 code implementations • NeurIPS 2020 • Seyed Mohammadreza Mousavi Kalan, Zalan Fabian, A. Salman Avestimehr, Mahdi Soltanolkotabi
In this approach, a model trained on a source task, where plenty of labeled training data is available, is used as a starting point for training a model on a related target task with only a few labeled training examples.
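A minimal sketch of the setting being analyzed, with a linear model standing in for whatever architecture is actually used; the paper studies the statistical limits of this procedure rather than prescribing it:

```python
import numpy as np

def finetune(w_source, x_target, y_target, lr=0.1, steps=50):
    """Start from the source-task model and take a few gradient steps
    on the scarce labeled target data."""
    w = w_source.copy()                 # source model as the starting point
    for _ in range(steps):
        grad = x_target.T @ (x_target @ w - y_target) / len(y_target)
        w -= lr * grad
    return w
```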
no code implementations • 11 Feb 2020 • Jinhyun So, Basak Guler, A. Salman Avestimehr
A major bottleneck in scaling federated learning to a large number of users is the overhead of secure model aggregation across many users.
no code implementations • 2 Feb 2019 • Jinhyun So, Basak Guler, A. Salman Avestimehr
How can we train a machine learning model while keeping the data private and secure?
no code implementations • 19 Jan 2019 • Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi, A. Salman Avestimehr
Perhaps unexpectedly, we show that QSGD maintains the fast convergence of SGD to a globally optimal model while significantly reducing the communication cost.
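A sketch of QSGD-style unbiased stochastic quantization: each coordinate is randomly rounded to one of s+1 levels of its normalized magnitude so that the quantized vector equals the original in expectation, which is why SGD's convergence guarantees carry through:

```python
import numpy as np

def qsgd_quantize(v, s=4, rng=None):
    """Unbiased s-level quantizer: E[qsgd_quantize(v)] = v."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    level = np.abs(v) / norm * s          # position in [0, s]
    lower = np.floor(level)
    prob_up = level - lower               # stochastic rounding keeps it unbiased
    rounded = lower + (rng.random(v.shape) < prob_up)
    return norm * np.sign(v) * rounded / s
```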
no code implementations • 27 Sep 2018 • Songze Li, Mingchao Yu, Chien-Sheng Yang, A. Salman Avestimehr, Sreeram Kannan, Pramod Viswanath
In particular, we propose PolyShard, a "polynomially coded sharding" scheme that achieves information-theoretic upper bounds on storage efficiency, system throughput, and trust, thus enabling a truly scalable system.
Cryptography and Security • Distributed, Parallel, and Cluster Computing • Information Theory
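A toy decode step in the spirit of the scheme: each node stores a Lagrange-coded combination u(alpha_i) of the K shards (as in the Lagrange encoder sketched earlier) and reports f(u(alpha_i)) for a degree-deg verification polynomial f; since those values lie on a polynomial of degree deg*(K-1), the master can fit it and read off f(shard_j) at the points beta_j. Real-valued least-squares interpolation is a readability assumption; PolyShard itself uses exact interpolation over a finite field:

```python
import numpy as np

def decode_at_betas(worker_vals, alphas, betas, deg):
    """Recover per-shard verification results f(shard_j) from enough
    per-node results f(u(alpha_i)) by polynomial interpolation."""
    K = len(betas)
    coeffs = np.polyfit(alphas, worker_vals, deg * (K - 1))  # fit f(u(x))
    return [np.polyval(coeffs, b) for b in betas]            # evaluate at beta_j
```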
no code implementations • 24 May 2018 • Songze Li, Seyed Mohammadreza Mousavi Kalan, Qian Yu, Mahdi Soltanolkotabi, A. Salman Avestimehr
In particular, PCR requires a recovery threshold that scales inversely with the amount of computation/storage available at each worker.
no code implementations • 31 Mar 2018 • A. Salman Avestimehr, Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi
We also analyze the convergence behavior of iterative encoded optimization algorithms, allowing us to characterize fundamental trade-offs between convergence rate, size of data set, accuracy, computational load (or data redundancy), and straggler tolerance in this framework.
no code implementations • 17 Oct 2017 • Qian Yu, Mohammad Ali Maddah-Ali, A. Salman Avestimehr
We consider the problem of computing the Fourier transform of high-dimensional vectors in a distributed manner over a cluster of machines consisting of a master node and multiple worker nodes, where the worker nodes can only store and process a fraction of the inputs.
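For intuition, the plain uncoded two-worker split (decimation-in-time) that coded FFT schemes add redundancy on top of; one worker transforms the even samples, the other the odd samples, and the master combines them with twiddle factors:

```python
import numpy as np

def two_worker_dft(x):
    N = len(x)                                   # assumes N is even
    E = np.fft.fft(x[0::2])                      # worker 1: DFT of even samples
    O = np.fft.fft(x[1::2])                      # worker 2: DFT of odd samples
    k = np.arange(N // 2)
    twiddle = np.exp(-2j * np.pi * k / N)
    return np.concatenate([E + twiddle * O, E - twiddle * O])

x = np.random.randn(16)
assert np.allclose(two_worker_dft(x), np.fft.fft(x))
```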
3 code implementations • NeurIPS 2017 • Qian Yu, Mohammad Ali Maddah-Ali, A. Salman Avestimehr
We consider a large-scale matrix multiplication problem where the computation is carried out using a distributed system with a master node and multiple worker nodes, where each worker can store parts of the input matrices.
Information Theory • Distributed, Parallel, and Cluster Computing
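A sketch of the encoding: with A split into m row-blocks and B into n column-blocks, the product of the two shares at a point x evaluates a polynomial whose coefficients are exactly the blocks of C = AB, so any m·n worker results suffice to interpolate C (the recovery threshold). Reals stand in for the finite field the code is actually defined over:

```python
import numpy as np

def poly_code_shares(A_blocks, B_blocks, xs):
    """Worker i receives p_A(x_i) = sum_j A_j x_i**j and
    p_B(x_i) = sum_k B_k x_i**(k*m); the exponents j + k*m in the
    product are all distinct, so each block A_j @ B_k appears as one
    coefficient of the degree-(m*n - 1) matrix polynomial."""
    m = len(A_blocks)
    A_shares = [sum(Aj * x**j for j, Aj in enumerate(A_blocks)) for x in xs]
    B_shares = [sum(Bk * x**(k * m) for k, Bk in enumerate(B_blocks)) for x in xs]
    return A_shares, B_shares

# Each worker i computes A_shares[i] @ B_shares[i]; the master
# interpolates the matrix polynomial from any m*n finished workers.
```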
2 code implementations • 16 Feb 2017 • Songze Li, Sucha Supittayapornpong, Mohammad Ali Maddah-Ali, A. Salman Avestimehr
We focus on sorting, which is the building block of many machine learning algorithms, and propose a novel distributed sorting algorithm, named Coded TeraSort, which substantially improves the execution time of the TeraSort benchmark in Hadoop MapReduce.
Distributed, Parallel, and Cluster Computing • Information Theory
no code implementations • 18 May 2016 • Eyal En Gad, Akshay Gadde, A. Salman Avestimehr, Antonio Ortega
A new sampling algorithm is proposed, which sequentially selects the graph nodes to be sampled, based on an aggressive search for the boundary of the signal over the graph.
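A toy version of sequential, boundary-seeking sampling on a graph given as adjacency lists; the scoring rule below (prefer unsampled nodes whose already-sampled neighbors disagree) is an illustrative assumption, not the paper's exact criterion:

```python
def sequential_boundary_sampling(adj, oracle, budget):
    """adj[v] lists the neighbors of node v; oracle(v) returns v's label.
    Queries are spent near suspected label boundaries."""
    labels = {0: oracle(0)}                     # seed query
    for _ in range(budget - 1):
        def score(v):
            nbr_labels = {labels[u] for u in adj[v] if u in labels}
            return len(nbr_labels)              # 2 distinct labels => near boundary
        candidates = [v for v in range(len(adj)) if v not in labels]
        v = max(candidates, key=score)          # ties break arbitrarily
        labels[v] = oracle(v)
    return labels
```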
no code implementations • 14 Feb 2015 • Aamir Anis, Aly El Gamal, A. Salman Avestimehr, Antonio Ortega
Graph-based methods play an important role in unsupervised and semi-supervised learning tasks by taking into account the underlying geometry of the data set.