no code implementations • 13 Oct 2023 • Chenxu Yang, Zheng Lin, Lanrui Wang, Chong Tian, Liang Pang, Jiangnan Li, Qirong Ho, Yanan Cao, Weiping Wang
Knowledge-grounded dialogue generation aims to mitigate the issue of text degeneration by incorporating external knowledge to supplement the context.
no code implementations • 10 Nov 2022 • Yonghao Zhuang, Hexu Zhao, Lianmin Zheng, Zhuohan Li, Eric P. Xing, Qirong Ho, Joseph E. Gonzalez, Ion Stoica, Hao Zhang
This pattern emerges when the two paradigms of model parallelism - intra-operator and inter-operator parallelism - are combined to support large models on large clusters.
2 code implementations • 27 Aug 2020 • Aurick Qiao, Sang Keun Choe, Suhas Jayaram Subramanya, Willie Neiswanger, Qirong Ho, Hao Zhang, Gregory R. Ganger, Eric P. Xing
Some recent schedulers choose job resources for users, but do so without awareness of how DL training can be re-optimized to better utilize the provided resources.
no code implementations • 11 Dec 2017 • Hao Zhang, Shizhen Xu, Graham Neubig, Wei Dai, Qirong Ho, Guangwen Yang, Eric P. Xing
Recent deep learning (DL) models have moved beyond static network architectures to dynamic ones, handling data where the network structure changes every example, such as sequences of variable lengths, trees, and graphs.
no code implementations • 11 Jun 2017 • Hao Zhang, Zeyu Zheng, Shizhen Xu, Wei Dai, Qirong Ho, Xiaodan Liang, Zhiting Hu, Jinliang Wei, Pengtao Xie, Eric P. Xing
We show that Poseidon enables Caffe and TensorFlow to achieve 15. 5x speed-up on 16 single-GPU machines, even with limited bandwidth (10GbE) and the challenging VGG19-22K network for image classification.
no code implementations • 13 Dec 2016 • Sulin Liu, Sinno Jialin Pan, Qirong Ho
Due to heavy communication caused by transmitting the data and the issue of data privacy and security, it is impossible to send data of different task to a master machine to perform multi-task learning.
no code implementations • 31 Dec 2015 • Eric P. Xing, Qirong Ho, Pengtao Xie, Wei Dai
Taking the view that Big ML systems can benefit greatly from ML-rooted statistical and algorithmic insights --- and that ML researchers should therefore not shy away from such systems design --- we discuss a series of principles and strategies distilled from our recent efforts on industrial-scale ML solutions.
no code implementations • 19 Dec 2015 • Hao Zhang, Zhiting Hu, Jinliang Wei, Pengtao Xie, Gunhee Kim, Qirong Ho, Eric Xing
To investigate how to adapt existing frameworks to efficiently support distributed GPUs, we propose Poseidon, a scalable system architecture for distributed inter-machine communication in existing DL frameworks.
no code implementations • 26 Nov 2015 • Pengtao Xie, Jin Kyu Kim, Yi Zhou, Qirong Ho, Abhimanu Kumar, Yao-Liang Yu, Eric Xing
Matrix-parametrized models, including multiclass logistic regression and sparse coding, are used in machine learning (ML) applications ranging from computer vision to computational biology.
1 code implementation • 4 Dec 2014 • Jinhui Yuan, Fei Gao, Qirong Ho, Wei Dai, Jinliang Wei, Xun Zheng, Eric P. Xing, Tie-Yan Liu, Wei-Ying Ma
When building large-scale machine learning (ML) programs, such as big topic models or deep neural nets, one usually assumes such tasks can only be attempted with industrial-sized clusters with thousands of nodes, which are out of reach for most practitioners or academic researchers.
no code implementations • NeurIPS 2014 • Kumar Avinava Dubey, Qirong Ho, Sinead A. Williamson, Eric P. Xing
Hierarchical clustering methods offer an intuitive and powerful way to model a wide variety of data sets.
no code implementations • NeurIPS 2014 • Seunghak Lee, Jin Kyu Kim, Xun Zheng, Qirong Ho, Garth A. Gibson, Eric P. Xing
Distributed machine learning has typically been approached from a data parallel perspective, where big data are partitioned to multiple workers and an algorithm is executed concurrently over different data subsets under various synchronization schemes to ensure speed-up and/or correctness.
no code implementations • 10 Nov 2014 • Xun Zheng, Jin Kyu Kim, Qirong Ho, Eric P. Xing
In real world industrial applications of topic modeling, the ability to capture gigantic conceptual space by learning an ultra-high dimensional topical representation, i. e., the so-called "big model", is becoming the next desideratum after enthusiasms on "big data", especially for fine-grained downstream tasks such as online advertising, where good performances are usually achieved by regression-based predictors built on millions if not billions of input features.
no code implementations • 29 Oct 2014 • Wei Dai, Abhimanu Kumar, Jinliang Wei, Qirong Ho, Garth Gibson, Eric P. Xing
As Machine Learning (ML) applications increase in data size and model complexity, practitioners turn to distributed clusters to satisfy the increased computational and memory demands.
no code implementations • 19 Sep 2014 • Pengtao Xie, Jin Kyu Kim, Yi Zhou, Qirong Ho, Abhimanu Kumar, Yao-Liang Yu, Eric Xing
Matrix-parametrized models, including multiclass logistic regression and sparse coding, are used in machine learning (ML) applications ranging from computer vision to computational biology.
no code implementations • 18 Jun 2014 • Seunghak Lee, Jin Kyu Kim, Xun Zheng, Qirong Ho, Garth A. Gibson, Eric P. Xing
When training large machine learning models with many variables or parameters, a single machine is often inadequate since the model may be too large to fit in memory, while training can take a long time even with stochastic updates.
no code implementations • 30 Dec 2013 • Eric P. Xing, Qirong Ho, Wei Dai, Jin Kyu Kim, Jinliang Wei, Seunghak Lee, Xun Zheng, Pengtao Xie, Abhimanu Kumar, Yao-Liang Yu
What is a systematic way to efficiently apply a wide spectrum of advanced ML programs to industrial scale problems, using Big Models (up to 100s of billions of parameters) on Big Data (up to terabytes or petabytes)?
no code implementations • 30 Dec 2013 • Jinliang Wei, Wei Dai, Abhimanu Kumar, Xun Zheng, Qirong Ho, Eric P. Xing
Many ML algorithms fall into the category of \emph{iterative convergent algorithms} which start from a randomly chosen initial point and converge to optima by repeating iteratively a set of procedures.
no code implementations • 19 Dec 2013 • Seunghak Lee, Jin Kyu Kim, Qirong Ho, Garth A. Gibson, Eric P. Xing
Training large machine learning (ML) models with many variables or parameters can take a long time if one employs sequential procedures even with stochastic updates.
no code implementations • NeurIPS 2013 • Qirong Ho, James Cipar, Henggang Cui, Seunghak Lee, Jin Kyu Kim, Phillip B. Gibbons, Garth A. Gibson, Greg Ganger, Eric P. Xing
We propose a parameter server system for distributed ML, which follows a Stale Synchronous Parallel (SSP) model of computation that maximizes the time computational workers spend doing useful work on ML algorithms, while still providing correctness guarantees.
no code implementations • NeurIPS 2013 • Junming Yin, Qirong Ho, Eric P. Xing
We propose a scalable approach for making inference about latent spaces of large networks.
no code implementations • NeurIPS 2012 • Qirong Ho, Junming Yin, Eric P. Xing
A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is $\Theta(\sum_{i}D_{i}^{2})$ (where $D_i$ is the degree of vertex $i$), which is much smaller than $N^2$ for low-maximum-degree networks.