no code implementations • 5 Oct 2023 • Chen Dun, Qiutai Pan, Shikai Jin, Ria Stevens, Mitchell D. Miller, George N. Phillips, Jr., Anastasios Kyrillidis
Determining the structure of a protein has been a decades-long open question.
no code implementations • 4 Oct 2023 • Chen Dun, Mirian Hipolito Garcia, Guoqing Zheng, Ahmed Hassan Awadallah, Anastasios Kyrillidis, Robert Sim
Large Language Models (LLMs) have the ability to solve a variety of tasks, such as text summarization and mathematical questions, just out of the box, but they are often trained with a single task in mind.
no code implementations • 7 Sep 2023 • John Chen, Chen Dun, Anastasios Kyrillidis
Advances in Semi-Supervised Learning (SSL) have almost entirely closed the gap between SSL and Supervised Learning, using only a fraction of the labels.
no code implementations • 14 Jun 2023 • Chen Dun, Mirian Hipolito Garcia, Guoqing Zheng, Ahmed Hassan Awadallah, Robert Sim, Anastasios Kyrillidis, Dimitrios Dimitriadis
Our gating function harnesses the knowledge of a pretrained common expert model to enhance its routing decisions on the fly.
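The paper does not give implementation details here, but the idea of conditioning a mixture-of-experts gate on a shared pretrained "common expert" can be sketched as follows. This is a minimal illustration, not the authors' method: the function names `route_with_common_expert` and the concatenation-based gate are assumptions for the example.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def route_with_common_expert(hidden, common_expert_fn, gate_weights):
    """Compute expert-routing probabilities, conditioning the gate on
    the output of a shared pretrained 'common expert' (hypothetical sketch).

    hidden:           (batch, d) token representations
    common_expert_fn: callable mapping (batch, d) -> (batch, d)
    gate_weights:     (2 * d, n_experts) gating matrix
    """
    # Run the common expert and concatenate its output with the raw
    # hidden state, so the gate can exploit the pretrained knowledge.
    common_out = common_expert_fn(hidden)               # (batch, d)
    gate_in = np.concatenate([hidden, common_out], -1)  # (batch, 2d)
    logits = gate_in @ gate_weights                     # (batch, n_experts)
    return softmax(logits)
```

With untrained (zero) gate weights the routing is uniform over experts; training the gate jointly with the experts would specialize it.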
no code implementations • 28 Oct 2022 • Qihan Wang, Chen Dun, Fangshuo Liao, Chris Jermaine, Anastasios Kyrillidis
LoFT is a model-parallel pretraining algorithm that partitions convolutional layers by filters to train them independently in a distributed setting, resulting in reduced memory and communication costs during pretraining.
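Filter-wise partitioning of a convolutional layer can be illustrated with a small sketch. This is not the LoFT implementation; the helper names `partition_filters` and `reassemble` are made up for the example, which only shows the split-and-merge bookkeeping over a weight tensor of shape (out_channels, in_channels, k, k).

```python
import numpy as np

def partition_filters(weights, n_workers):
    """Split a conv layer's filters (output channels) into disjoint
    groups, one per worker, for filter-wise model parallelism."""
    groups = np.array_split(np.arange(weights.shape[0]), n_workers)
    # Each shard keeps its original filter indices so it can be merged back.
    return [(idx, weights[idx]) for idx in groups]

def reassemble(shards, out_channels):
    """Merge independently trained filter shards back into one layer."""
    merged = np.empty((out_channels,) + shards[0][1].shape[1:])
    for idx, w in shards:
        merged[idx] = w
    return merged
```

Because each worker holds only its own shard, per-worker memory and the communication needed per synchronization shrink roughly in proportion to the number of workers.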
no code implementations • 28 Oct 2022 • Chen Dun, Mirian Hipolito, Chris Jermaine, Dimitrios Dimitriadis, Anastasios Kyrillidis
Asynchronous learning protocols have regained attention lately, especially in the Federated Learning (FL) setup, where slower clients can severely impede the learning process.
no code implementations • 23 Oct 2021 • Zhenwei Dai, Chen Dun, Yuxin Tang, Anastasios Kyrillidis, Anshumali Shrivastava
Federated learning enables many local devices to train a deep learning model jointly without sharing the local data.
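The core aggregation step of federated learning, as in the standard FedAvg scheme (not necessarily the method of this paper), can be sketched in a few lines: clients train locally and only model parameters, weighted by local dataset size, are averaged on the server.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """One server-side aggregation round in the FedAvg style:
    average client parameter vectors weighted by local dataset size.
    Raw training data never leaves the clients."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))
```

In a full system this round alternates with local SGD on each device; only the parameter vectors cross the network.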
no code implementations • 2 Jul 2021 • Chen Dun, Cameron R. Wolfe, Christopher M. Jermaine, Anastasios Kyrillidis
Thus, ResIST reduces the per-iteration communication, memory, and time requirements of ResNet training to only a fraction of the requirements of full-model training.
1 code implementation • 20 Feb 2021 • Cameron R. Wolfe, Jingkang Yang, Arindam Chowdhury, Chen Dun, Artun Bayer, Santiago Segarra, Anastasios Kyrillidis
The graph convolutional network (GCN) is a go-to solution for machine learning on graphs, but its training is notoriously difficult to scale both in terms of graph size and the number of model parameters.
2 code implementations • 4 Oct 2019 • Binhang Yuan, Cameron R. Wolfe, Chen Dun, Yuxin Tang, Anastasios Kyrillidis, Christopher M. Jermaine
These properties of IST can cope with issues due to distributed data, slow interconnects, or limited device memory, making IST a suitable approach for cases of mandatory distribution.
4 code implementations • ICLR 2020 • Jiechuan Jiang, Chen Dun, Tiejun Huang, Zongqing Lu
The key is to understand the mutual interplay between agents.