1 code implementation • 20 Nov 2023 • Zhuosheng Zhang, Yao Yao, Aston Zhang, Xiangru Tang, Xinbei Ma, Zhiwei He, Yiming Wang, Mark Gerstein, Rui Wang, Gongshen Liu, Hai Zhao
Large language models (LLMs) have dramatically enhanced the field of language intelligence, as demonstrably evidenced by their formidable empirical performance across a spectrum of complex reasoning tasks.
no code implementations • 15 Oct 2023 • Chengwei Qin, Aston Zhang, Anirudh Dagar, Wenming Ye
The output reasoning path is then used to choose demonstrations that are prepended to the test sample for inference.
2 code implementations • 20 Sep 2023 • Zhuosheng Zhang, Aston Zhang
Autonomous user interface (UI) agents aim to facilitate task automation by interacting with the user interface without manual intervention.
1 code implementation • 21 May 2023 • Rami Aly, Xingjian Shi, Kaixiang Lin, Aston Zhang, Andrew Gordon Wilson
We observe, in the context of classification tasks, that instruction finetuned language models exhibit remarkable prompt robustness, and we subsequently propose a simple method to eliminate the need for handcrafted prompts, named AuT-Few.
1 code implementation • NeurIPS 2023 • Shuhuai Ren, Aston Zhang, Yi Zhu, Shuai Zhang, Shuai Zheng, Mu Li, Alex Smola, Xu sun
This work proposes POMP, a prompt pre-training method for vision-language models.
1 code implementation • 10 Apr 2023 • Jiaao Chen, Aston Zhang, Mu Li, Alex Smola, Diyi Yang
Diffusion models that are based on iterative denoising have been recently proposed and leveraged in various generation tasks like image generation.
1 code implementation • 8 Feb 2023 • Chengwei Qin, Aston Zhang, Zhuosheng Zhang, Jiaao Chen, Michihiro Yasunaga, Diyi Yang
Spurred by advancements in scale, large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot -- i. e., without adaptation on downstream data.
1 code implementation • 6 Feb 2023 • Taojiannan Yang, Yi Zhu, Yusheng Xie, Aston Zhang, Chen Chen, Mu Li
Recent vision transformer based video models mostly follow the ``image pre-training then finetuning" paradigm and have achieved great success on multiple video benchmarks.
Ranked #2 on Action Recognition on Diving-48 (using extra training data)
3 code implementations • 2 Feb 2023 • Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis, Alex Smola
Large language models (LLMs) have shown impressive performance on complex reasoning by leveraging chain-of-thought (CoT) prompting to generate intermediate reasoning chains as the rationale to infer the answer.
Ranked #3 on Science Question Answering on ScienceQA
no code implementations • 4 Jan 2023 • Jiaao Chen, Aston Zhang, Xingjian Shi, Mu Li, Alex Smola, Diyi Yang
We discover the following design patterns: (i) group layers in a spindle pattern; (ii) allocate the number of trainable parameters to layers uniformly; (iii) tune all the groups; (iv) assign proper tuning strategies to different groups.
1 code implementation • 29 Dec 2022 • Zichang Liu, Zhiqiang Tang, Xingjian Shi, Aston Zhang, Mu Li, Anshumali Shrivastava, Andrew Gordon Wilson
The ability to jointly learn from multiple modalities, such as text, audio, and visual data, is a defining feature of intelligent systems.
no code implementations • 21 Dec 2022 • M Saiful Bari, Aston Zhang, Shuai Zheng, Xingjian Shi, Yi Zhu, Shafiq Joty, Mu Li
Pre-trained large language models can efficiently interpolate human-written prompts in a natural way.
no code implementations • 10 Dec 2022 • Chaoyang He, Shuai Zheng, Aston Zhang, George Karypis, Trishul Chilimbi, Mahdi Soltanolkotabi, Salman Avestimehr
The mixture of Expert (MoE) parallelism is a recent advancement that scales up the model size with constant computational cost.
1 code implementation • 12 Oct 2022 • Haotao Wang, Junyuan Hong, Aston Zhang, Jiayu Zhou, Zhangyang Wang
As a result, both the stem and the classification head in the final network are hardly affected by backdoor training samples.
5 code implementations • 7 Oct 2022 • Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
Providing these steps for prompting demonstrations is called chain-of-thought (CoT) prompting.
1 code implementation • 4 Jul 2022 • Haotao Wang, Aston Zhang, Shuai Zheng, Xingjian Shi, Mu Li, Zhangyang Wang
In addition, NoFrost achieves a $23. 56\%$ adversarial robustness against PGD attack, which improves the $13. 57\%$ robustness in BN-based AT.
1 code implementation • 4 Jul 2022 • Haotao Wang, Aston Zhang, Yi Zhu, Shuai Zheng, Mu Li, Alex Smola, Zhangyang Wang
However, in real-world applications, it is common for the training sets to have long-tailed distributions.
1 code implementation • 16 Jun 2022 • Xiaoshuai Hao, Yi Zhu, Srikar Appalaraju, Aston Zhang, Wanqian Zhang, Bo Li, Mu Li
Data augmentation is a necessity to enhance data efficiency in deep learning.
no code implementations • NeurIPS 2021 • Aston Zhang, Yi Tay, Yikang Shen, Alvin Chan Guo Wei, Shuai Zhang
On the other hand, the extent of the Self-IRU recursion is controlled by gates whose values are between 0 and 1 and may vary across the temporal dimension of sequences, enabling dynamic soft recursion depth at each time step.
4 code implementations • 8 Oct 2021 • Eleonora Grassucci, Aston Zhang, Danilo Comminiello
In this paper, we define the parameterization of hypercomplex convolutional layers and introduce the family of parameterized hypercomplex neural networks (PHNNs) that are lightweight and efficient large-scale models.
Ranked #1 on Sound Event Detection on L3DAS21
no code implementations • ACL 2021 • Aston Zhang, Alvin Chan, Yi Tay, Jie Fu, Shuohang Wang, Shuai Zhang, Huajie Shao, Shuochao Yao, Roy Ka-Wei Lee
Orthogonality constraints encourage matrices to be orthogonal for numerical stability.
1 code implementation • 21 Jun 2021 • Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola
This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code.
no code implementations • 23 Feb 2021 • Huajie Shao, Jun Wang, Haohong Lin, Xuezhou Zhang, Aston Zhang, Heng Ji, Tarek Abdelzaher
The algorithm is injected into a Conditional Variational Autoencoder (CVAE), allowing \textit{Apex} to control both (i) the order of keywords in the generated sentences (conditioned on the input keywords and their order), and (ii) the trade-off between diversity and accuracy.
3 code implementations • 17 Feb 2021 • Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Cheung Hui, Jie Fu
Recent works have demonstrated reasonable success of representation learning in hypercomplex space.
2 code implementations • 12 Feb 2021 • Tianlong Chen, Yongduo Sui, Xuxi Chen, Aston Zhang, Zhangyang Wang
With graphs rapidly growing in size and deeper graph neural networks (GNNs) emerging, the training and inference of GNNs become increasingly expensive.
no code implementations • 1 Jan 2021 • Yi Tay, Yikang Shen, Alvin Chan, Aston Zhang, Shuai Zhang
This paper explores an intriguing idea of recursively parameterizing recurrent nets.
no code implementations • ICLR 2021 • Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Hui, Jie Fu
Recent works have demonstrated reasonable success of representation learning in hypercomplex space.
3 code implementations • 11 Nov 2020 • Shuai Zhang, Huoyu Liu, Aston Zhang, Yue Hu, Ce Zhang, Yumeng Li, Tanchao Zhu, Shaojian He, Wenwu Ou
Furthermore, we present two variants of hypercuboids to enhance the capability in capturing the diversities of user interests.
4 code implementations • 31 Oct 2020 • Huajie Shao, Zhisheng Xiao, Shuochao Yao, Aston Zhang, Shengzhong Liu, Tarek Abdelzaher
ControlVAE is a new variational autoencoder (VAE) framework that combines the automatic control theory with the basic VAE to stabilize the KL-divergence of VAE models to a specified value.
2 code implementations • 24 Oct 2020 • Zhiqiang Hu, Roy Ka-Wei Lee, Charu C. Aggarwal, Aston Zhang
This article aims to provide a comprehensive review of recent research efforts on text style transfer.
2 code implementations • Findings of the Association for Computational Linguistics 2020 • Alvin Chan, Yi Tay, Yew-Soon Ong, Aston Zhang
This paper demonstrates a fatal vulnerability in natural language inference (NLI) and text classification systems.
1 code implementation • ICLR 2021 • Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu
While there are studies that seek to control high-level attributes (such as sentiment and topic) of generated text, there is still a lack of more precise control over its content at the word- and phrase-level.
no code implementations • 13 Apr 2020 • Huajie Shao, Dachun Sun, Jiahao Wu, Zecheng Zhang, Aston Zhang, Shuochao Yao, Shengzhong Liu, Tianshi Wang, Chao Zhang, Tarek Abdelzaher
Motivated by this trend, we describe a novel item-item cross-platform recommender system, $\textit{paper2repo}$, that recommends relevant repositories on GitHub that match a given paper in an academic search system such as Microsoft Academic.
no code implementations • ICML 2020 • Huajie Shao, Shuochao Yao, Dachun Sun, Aston Zhang, Shengzhong Liu, Dongxin Liu, Jun Wang, Tarek Abdelzaher
Variational Autoencoders (VAE) and their variants have been widely used in a variety of applications, such as dialog generation, image generation and disentangled representation learning.
1 code implementation • 14 Feb 2020 • Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, Alexander J. Smola
Transformer has been widely used thanks to its ability to capture sequence information in an efficient way.
no code implementations • NeurIPS 2019 • Yi Tay, Anh Tuan Luu, Aston Zhang, Shuohang Wang, Siu Cheung Hui
Attentional models are distinctly characterized by their ability to learn relative importance, i. e., assigning a different weight to input values.
no code implementations • 25 Sep 2019 • Yi Tay, Aston Zhang, Shuai Zhang, Alvin Chan, Luu Anh Tuan, Siu Cheung Hui
We propose R2D2 layers, a new neural block for training efficient NLP models.
no code implementations • 18 Aug 2019 • Ahmed El-Kishky, Frank Xu, Aston Zhang, Jiawei Han
However, in many languages and specialized corpora, words are composed by concatenating semantically meaningful subword structures.
4 code implementations • 9 Jul 2019 • Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu
We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).
2 code implementations • NeurIPS 2021 • Yunhui Long, Boxin Wang, Zhuolin Yang, Bhavya Kailkhura, Aston Zhang, Carl A. Gunter, Bo Li
In particular, we train a student data generator with an ensemble of teacher discriminators and propose a novel private gradient aggregation mechanism to ensure differential privacy on all information that flows from teacher discriminators to the student generator.
1 code implementation • ACL 2019 • Yi Tay, Aston Zhang, Luu Anh Tuan, Jinfeng Rao, Shuai Zhang, Shuohang Wang, Jie Fu, Siu Cheung Hui
Many state-of-the-art neural models for NLP are heavily parameterized and thus memory inefficient.
no code implementations • 6 Jun 2019 • Shuai Zhang, Lina Yao, Lucas Vinh Tran, Aston Zhang, Yi Tay
All in all, we conduct extensive experiments on six real-world datasets, demonstrating the effectiveness of Quaternion algebra in recommender systems.
no code implementations • ACL 2019 • Yi Tay, Shuohang Wang, Luu Anh Tuan, Jie Fu, Minh C. Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, Aston Zhang
This paper tackles the problem of reading comprehension over long narratives where documents easily span over thousands of tokens.
no code implementations • WS 2018 • Ahmed El-Kishky, Frank Xu, Aston Zhang, Stephen Macke, Jiawei Han
Recent literature has shown a wide variety of benefits to mapping traditional one-hot representations of words and phrases to lower-dimensional real-valued vectors known as word embeddings.
no code implementations • 9 Mar 2018 • Huan Gui, Qi Zhu, Liyuan Liu, Aston Zhang, Jiawei Han
We study the task of expert finding in heterogeneous bibliographical networks based on two aspects: textual content analysis and authority ranking.
no code implementations • 9 Sep 2017 • Shuochao Yao, Yiran Zhao, Huajie Shao, Aston Zhang, Chao Zhang, Shen Li, Tarek Abdelzaher
Recent advances in deep learning have led various applications to unprecedented achievements, which could potentially bring higher intelligence to a broad spectrum of mobile and ubiquitous applications.
1 code implementation • 5 Jun 2017 • Shuochao Yao, Yiran Zhao, Aston Zhang, Lu Su, Tarek Abdelzaher
It is thus able to shorten execution time by 71. 4% to 94. 5%, and decrease energy consumption by 72. 2% to 95. 7%.
1 code implementation • 7 Nov 2016 • Shuochao Yao, Shaohan Hu, Yiran Zhao, Aston Zhang, Tarek Abdelzaher
For many mobile applications, it is hard to find a distribution that exactly describes the noise in practice.