no code implementations • 14 Apr 2024 • Siyuan Feng, Jiawei Liu, Ruihang Lai, Charlie F. Ruan, Yong Yu, Lingming Zhang, Tianqi Chen
While a traditional bottom-up development pipeline fails to close the gap in a timely manner, we introduce TapML, a top-down approach and tooling designed to streamline the deployment of ML systems on diverse platforms and optimized for developer productivity.
1 code implementation • 13 Feb 2024 • Shentao Yang, Tianqi Chen, Mingyuan Zhou
Aligning text-to-image (T2I) diffusion models with preferences has been gaining increasing research attention.
no code implementations • 25 Jan 2024 • Bo Zhou, Jun Hou, Tianqi Chen, Yinchi Zhou, Xiongchao Chen, Huidong Xie, Qiong Liu, Xueqi Guo, Yu-Jung Tsai, Vladimir Y. Panin, Takuya Toyonaga, James S. Duncan, Chi Liu
Low-dose PET offers a valuable means of minimizing radiation exposure in PET imaging.
no code implementations • 23 Dec 2023 • Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, Zhihao Jia
In the rapidly evolving landscape of artificial intelligence (AI), generative large language models (LLMs) stand at the forefront, revolutionizing how we interact with our data.
no code implementations • 3 Dec 2023 • Tianqi Chen, Yongfei Liu, Zhendong Wang, Jianbo Yuan, Quanzeng You, Hongxia Yang, Mingyuan Zhou
In light of the remarkable success of in-context learning in large language models, its potential extension to the vision domain, particularly with visual foundation models like Stable Diffusion, has sparked considerable interest.
no code implementations • 1 Nov 2023 • Ruihang Lai, Junru Shao, Siyuan Feng, Steven S. Lyubomirsky, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared G. Roesch, Todd C. Mowry, Tianqi Chen
Dynamic shape computations have become critical in modern machine learning workloads, especially in emerging large language models.
1 code implementation • 29 Oct 2023 • Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen, Baris Kasikci
To maximize LLM serving throughput, we introduce Atom, a low-bit quantization method that achieves significant throughput improvements with negligible accuracy loss.
1 code implementation • NeurIPS 2023 • Mingyuan Zhou, Tianqi Chen, Zhendong Wang, Huangjie Zheng
We introduce beta diffusion, a novel generative modeling method that integrates demasking and denoising to generate data within bounded ranges.
1 code implementation • 28 May 2023 • Tianqi Chen, Mingyuan Zhou
However, this paper finds that it has limited ability to model some other types of data, such as count and non-negative continuous data, which are often highly sparse, skewed, heavy-tailed, and/or overdispersed.
no code implementations • 17 May 2023 • Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, Todd C. Mowry
Dynamic control flow is an important technique often used to design expressive and efficient deep learning computations, for applications such as text parsing, machine translation, and early exit from deep models.
no code implementations • 8 Feb 2023 • Siyuan Chen, Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, Todd C. Mowry
Further, batching puts strict restrictions on memory adjacency and can lead to high data movement costs.
no code implementations • 25 Aug 2022 • Elias Jääsaari, Michelle Ma, Ameet Talwalkar, Tianqi Chen
There is a growing need to deploy machine learning for different tasks on a wide array of new hardware platforms.
2 code implementations • 11 Jul 2022 • Zihao Ye, Ruihang Lai, Junru Shao, Tianqi Chen, Luis Ceze
We propose SparseTIR, a sparse tensor compilation abstraction that offers composable formats and composable transformations for deep learning workloads.
2 code implementations • 9 Jul 2022 • Siyuan Feng, Bohan Hou, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu, Tianqi Chen
Finally, we build an end-to-end framework on top of our abstraction to automatically optimize deep learning models for given tensor computation primitives.
no code implementations • 26 May 2022 • Junru Shao, Xiyou Zhou, Siyuan Feng, Bohan Hou, Ruihang Lai, Hongyi Jin, Wuwei Lin, Masahiro Masuda, Cody Hao Yu, Tianqi Chen
Experimental results show that MetaSchedule can cover the search space used in the state-of-the-art tensor program optimization frameworks in a modular way.
1 code implementation • 28 Mar 2022 • Tianning Zhang, Tianqi Chen, Erping Li, Bo Yang, L. K. Ang
The tensor network, as a factorization of tensors, aims to support the operations common to ordinary tensors, such as addition, contraction, and stacking.
1 code implementation • 1 Nov 2021 • Byungsoo Jeon, Sunghyun Park, Peiyuan Liao, Sheng Xu, Tianqi Chen, Zhihao Jia
Given the fast-evolving nature of the DL ecosystem, this manual approach often slows continuous innovation across layers: hardware vendors cannot quickly deploy their cutting-edge libraries, DL framework developers must repeatedly adjust hand-coded rules to accommodate new library versions, and machine learning practitioners must wait for new technologies to be integrated, often encountering unsatisfactory performance in the meantime.
no code implementations • 19 Oct 2021 • Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, Todd C. Mowry
There is often variation in the shape and size of input data used for deep learning.
no code implementations • 15 Aug 2021 • Ren Wang, Tianqi Chen, Alfred Hero
Recent works have theoretically and empirically shown that deep neural networks (DNNs) have an inherent vulnerability to small perturbations.
no code implementations • 27 Jun 2021 • Ren Wang, Tianqi Chen, Stephen Lindsly, Cooper Stansbury, Indika Rajapakse, Alfred Hero
This immuno-mimetic model leads to a new computational biology framework for robustification of deep neural networks against adversarial attacks.
1 code implementation • 27 Jun 2021 • Ren Wang, Tianqi Chen, Stephen Lindsly, Cooper Stansbury, Alnawaz Rehemtulla, Indika Rajapakse, Alfred Hero
Initializing a population of exemplars that is balanced across classes, RAILS starts from a uniform label distribution that encourages diversity and uses an evolutionary optimization process to adaptively adjust the predictive label distribution in a manner that emulates the way the natural immune system recognizes novel pathogens.
1 code implementation • 27 Jun 2021 • Ren Wang, Tianqi Chen, Philip Yao, Sijia Liu, Indika Rajapakse, Alfred Hero
K-Nearest Neighbor (kNN)-based deep learning methods have been adopted in many applications due to their simplicity and geometric interpretability.
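As background for these methods, a generic deep-kNN sketch: classify by nearest neighbors in a learned feature space. Random vectors stand in for a DNN's penultimate-layer embeddings; this is illustrative, not the paper's construction.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Random features stand in for embeddings from a trained DNN.
feats = np.random.randn(200, 64)
labels = np.random.randint(0, 10, size=200)

# Classify a query point by majority vote among its 5 nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=5).fit(feats, labels)
pred = knn.predict(np.random.randn(1, 64))
```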
no code implementations • 27 Mar 2021 • Ziheng Jiang, Animesh Jain, Andrew Liu, Josh Fromm, Chengqian Ma, Tianqi Chen, Luis Ceze
Quantization is a key technique to reduce the resource requirement and improve the performance of neural network deployment.
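For context, a minimal sketch of symmetric int8 quantization in numpy; this is a generic illustration of the technique, not the specific scheme evaluated in the paper.

```python
import numpy as np

def quantize_int8(x):
    # Symmetric per-tensor quantization to signed 8-bit integers.
    scale = max(np.abs(x).max(), 1e-8) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(x)
print("max reconstruction error:", np.abs(x - dequantize(q, s)).max())
```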
no code implementations • 18 Dec 2020 • Ren Wang, Tianqi Chen, Stephen Lindsly, Alnawaz Rehemtulla, Alfred Hero, Indika Rajapakse
RAILS incorporates an Adaptive Immune System Emulation (AISE), which emulates in silico the biological mechanisms that are used to defend the host against attacks by pathogens.
no code implementations • 2 Nov 2020 • Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, Todd C. Mowry
Optimizing deep learning models is generally performed in two steps: (i) high-level graph optimizations such as kernel fusion, and (ii) low-level kernel optimizations such as those found in vendor libraries.
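As a concrete example of the first step, here is a pointwise chain that a graph-level fuser can merge into a single kernel; `torch.compile` is used only as one readily available fuser, not the system described in the paper.

```python
import torch

def f(x, w, b):
    # One matmul followed by two pointwise ops (add, then relu).
    return torch.relu(x @ w + b)

# A graph-level fuser can emit the add and relu as a single kernel.
f_opt = torch.compile(f)
y = f_opt(torch.randn(8, 16), torch.randn(16, 16), torch.randn(16))
```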
1 code implementation • ICLR 2021 • Marisa Kirisame, Steven Lyubomirsky, Altan Haan, Jennifer Brennan, Mike He, Jared Roesch, Tianqi Chen, Zachary Tatlock
Checkpointing enables the training of deep learning models under restricted memory budgets by freeing intermediate activations from memory and recomputing them on demand.
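The paper's Dynamic Tensor Rematerialization makes these free-and-recompute decisions online at runtime; PyTorch's static checkpoint API, shown in this sketch, illustrates the underlying idea.

```python
import torch
from torch.utils.checkpoint import checkpoint

block = torch.nn.Sequential(
    torch.nn.Linear(512, 512), torch.nn.ReLU(),
    torch.nn.Linear(512, 512), torch.nn.ReLU(),
)
x = torch.randn(32, 512, requires_grad=True)

# Activations inside `block` are freed after the forward pass and
# recomputed on demand during the backward pass.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```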
no code implementations • 17 Apr 2019 • Jared Roesch, Steven Lyubomirsky, Marisa Kirisame, Logan Weber, Josh Pollock, Luis Vega, Ziheng Jiang, Tianqi Chen, Thierry Moreau, Zachary Tatlock
Using these extension mechanisms, Relay supports a unified compiler that can target a variety of hardware platforms.
no code implementations • 5 Dec 2018 • Ignacio Cano, Lequn Chen, Pedro Fonseca, Tianqi Chen, Chern Cheah, Karan Gupta, Ramesh Chandra, Arvind Krishnamurthy
Our large-scale analysis confirms that VMs are often misconfigured, either overprovisioned or underprovisioned, and that this problem is pervasive across a wide range of private clusters.
no code implementations • 25 Oct 2018 • Meghan Cowan, Thierry Moreau, Tianqi Chen, Luis Ceze
To date, none of the popular deep learning frameworks directly support low-precision operators, partly due to a lack of optimized low-precision libraries.
no code implementations • 26 Sep 2018 • Jared Roesch, Steven Lyubomirsky, Logan Weber, Josh Pollock, Marisa Kirisame, Tianqi Chen, Zachary Tatlock
Machine learning powers diverse services in industry including search, translation, recommendation systems, and security.
no code implementations • 11 Jul 2018 • Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Specialized Deep Learning (DL) acceleration stacks, designed for a specific set of frameworks, model architectures, operators, and data types, offer the allure of high performance while sacrificing flexibility.
no code implementations • NeurIPS 2018 • Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective deep learning systems.
1 code implementation • 12 Feb 2018 • Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Experimental results show that TVM delivers performance across hardware back-ends that is competitive with state-of-the-art, hand-tuned libraries for low-power CPUs, mobile GPUs, and server-class GPUs.
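As a flavor of the programming model, here is a classic matrix-multiply sketch using TVM's tensor-expression (`te`) schedule API; it assumes a TVM build where this legacy front end is available.

```python
import tvm
from tvm import te

n = 1024
A = te.placeholder((n, n), name="A")
B = te.placeholder((n, n), name="B")
k = te.reduce_axis((0, n), name="k")
C = te.compute((n, n), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k),
               name="C")

# Describe the loop structure separately from the computation.
s = te.create_schedule(C.op)
io, ii = s[C].split(C.op.axis[0], factor=32)  # simple loop tiling

# Compile to a callable kernel for the local CPU via LLVM.
fmm = tvm.build(s, [A, B, C], target="llvm")
```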
6 code implementations • 21 Apr 2016 • Tianqi Chen, Bing Xu, Chiyuan Zhang, Carlos Guestrin
In the extreme case, our analysis also shows that the memory consumption can be reduced to O(log n) with as little as O(n log n) extra cost for forward computation.
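The paper's headline trade-off is O(sqrt(n)) memory for one extra forward pass, obtained by checkpointing roughly every sqrt(n) layers. A PyTorch sketch of that segmentation (the paper's own implementation targets MXNet):

```python
import math
import torch
from torch.utils.checkpoint import checkpoint_sequential

n_layers = 16
model = torch.nn.Sequential(*[
    torch.nn.Sequential(torch.nn.Linear(256, 256), torch.nn.ReLU())
    for _ in range(n_layers)
])
x = torch.randn(8, 256, requires_grad=True)

# Keep activations only at ~sqrt(n) segment boundaries; recompute the rest.
out = checkpoint_sequential(model, int(math.sqrt(n_layers)), x)
out.sum().backward()
```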
25 code implementations • 9 Mar 2016 • Tianqi Chen, Carlos Guestrin
In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges.
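A minimal usage sketch of the XGBoost system via its scikit-learn-style Python API, assuming the `xgboost` and `scikit-learn` packages are installed:

```python
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Gradient-boosted trees with typical starter hyperparameters.
model = xgb.XGBClassifier(n_estimators=100, max_depth=4, learning_rate=0.1)
model.fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))
```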
2 code implementations • 3 Dec 2015 • Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, Zheng Zhang
This paper describes both the API design and the system implementation of MXNet, and explains how embedding of both symbolic expression and tensor operation is handled in a unified fashion.
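A small sketch of the two styles side by side, assuming an MXNet 1.x installation where the symbolic `bind` executor API is available:

```python
import mxnet as mx

# Imperative tensor operations execute eagerly.
a = mx.nd.ones((2, 3))
print((a * 2 + 1).asnumpy())

# Symbolic expressions build a graph first, then run via an executor.
x = mx.sym.Variable("x")
y = mx.sym.Variable("y")
z = x * 2 + y
ex = z.bind(mx.cpu(), {"x": mx.nd.ones((2, 3)), "y": mx.nd.ones((2, 3))})
print(ex.forward()[0].asnumpy())
```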
3 code implementations • 18 Nov 2015 • Tianqi Chen, Ian Goodfellow, Jonathon Shlens
Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network.
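A minimal numpy sketch of the Net2WiderNet idea: duplicate hidden units and split their outgoing weights so the widened network computes the same function. Elementwise activations are assumed; the paper also adds noise and covers Net2DeeperNet.

```python
import numpy as np

def net2wider(W1, b1, W2, new_width):
    # W1: (d_in, h) incoming weights, b1: (h,), W2: (h, d_out) outgoing.
    old_width = W1.shape[1]
    idx = np.random.randint(0, old_width, size=new_width - old_width)
    # Duplicate incoming weights/biases of the sampled units.
    W1_new = np.hstack([W1, W1[:, idx]])
    b1_new = np.concatenate([b1, b1[idx]])
    # Split outgoing weights by each unit's replication count so the
    # widened network computes exactly the same function.
    counts = np.bincount(np.concatenate([np.arange(old_width), idx]),
                         minlength=old_width)
    W2_new = np.vstack([W2 / counts[:, None], W2[idx] / counts[idx, None]])
    return W1_new, b1_new, W2_new

# Sanity check: the widened net matches the original on a random input.
W1, b1, W2 = np.random.randn(5, 4), np.random.randn(4), np.random.randn(4, 3)
W1w, b1w, W2w = net2wider(W1, b1, W2, new_width=7)
x = np.random.randn(5)
relu = lambda z: np.maximum(z, 0)
assert np.allclose(relu(x @ W1 + b1) @ W2, relu(x @ W1w + b1w) @ W2w)
```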
no code implementations • NeurIPS 2015 • Yi-An Ma, Tianqi Chen, Emily B. Fox
That is, any continuous Markov process that provides samples from the target distribution can be written in our framework.
2 code implementations • 5 May 2015 • Bing Xu, Naiyan Wang, Tianqi Chen, Mu Li
In this paper we investigate the performance of different types of rectified activation functions in convolutional neural networks: the standard rectified linear unit (ReLU), leaky rectified linear unit (Leaky ReLU), parametric rectified linear unit (PReLU), and a new randomized leaky rectified linear unit (RReLU).
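For reference, numpy definitions of the four activations compared in the paper; the RReLU bounds 1/8 and 1/3 follow the paper's experiments.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    return np.where(x >= 0, x, slope * x)

def prelu(x, a):
    # In PReLU the negative slope `a` is a learned parameter.
    return np.where(x >= 0, x, a * x)

def rrelu(x, lower=1/8, upper=1/3, training=True):
    # RReLU samples the negative slope uniformly during training and
    # fixes it to the midpoint at test time.
    if training:
        a = np.random.uniform(lower, upper, size=x.shape)
    else:
        a = (lower + upper) / 2
    return np.where(x >= 0, x, a * x)
```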
no code implementations • 22 Oct 2014 • Jingbo Shang, Tianqi Chen, Hang Li, Zhengdong Lu, Yong Yu
In this paper, we tackle this challenge with a novel parallel and efficient algorithm for feature-based matrix factorization.
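For context, a plain sequential SGD baseline for matrix factorization; the paper's contribution is a parallel, feature-based generalization of this update, so treat this only as the starting point.

```python
import numpy as np

def mf_sgd(R, rank=8, lr=0.01, reg=0.02, epochs=10):
    # R: (users, items) rating matrix with 0 marking missing entries.
    n_u, n_i = R.shape
    P = 0.1 * np.random.randn(n_u, rank)
    Q = 0.1 * np.random.randn(n_i, rank)
    for _ in range(epochs):
        for u, i in np.argwhere(R > 0):
            pu, qi = P[u].copy(), Q[i].copy()
            e = R[u, i] - pu @ qi  # prediction error on observed entry
            P[u] += lr * (e * qi - reg * pu)
            Q[i] += lr * (e * pu - reg * qi)
    return P, Q
```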
5 code implementations • 17 Feb 2014 • Tianqi Chen, Emily B. Fox, Carlos Guestrin
Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals.
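A minimal sketch of the SGHMC update studied in this paper: a stochastic-gradient momentum step with a friction term, with the gradient-noise estimate (beta-hat in the paper) set to zero for simplicity.

```python
import numpy as np

def sghmc_step(theta, v, grad_U, eps=1e-2, alpha=0.1):
    # v is the momentum; alpha plays the role of the friction term.
    # Injected noise has variance 2 * (alpha - beta_hat) * eps, beta_hat = 0.
    noise = np.sqrt(2 * alpha * eps) * np.random.randn(*theta.shape)
    v = (1.0 - alpha) * v - eps * grad_U(theta) + noise
    return theta + v, v

# Example: sample from N(0, 1), i.e. U(theta) = theta**2 / 2.
theta, v = np.zeros(1), np.zeros(1)
for _ in range(1000):
    theta, v = sghmc_step(theta, v, grad_U=lambda t: t)
```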
no code implementations • LREC 2012 • Behrang QasemiZadeh, Paul Buitelaar, Tianqi Chen, Georgeta Bordea
In this paper, we address the problem of extracting technical terms automatically from an unannotated corpus.
no code implementations • 11 Sep 2011 • Tianqi Chen, Zhao Zheng, Qiuxia Lu, Weinan Zhang, Yong Yu
Recommender systems have become increasingly popular and are now widely used in many applications.