Search Results for author: Aston Zhang

Found 48 papers, 29 papers with code

Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents

1 code implementation • 20 Nov 2023 • Zhuosheng Zhang, Yao Yao, Aston Zhang, Xiangru Tang, Xinbei Ma, Zhiwei He, Yiming Wang, Mark Gerstein, Rui Wang, Gongshen Liu, Hai Zhao

Large language models (LLMs) have dramatically enhanced the field of language intelligence, as demonstrably evidenced by their formidable empirical performance across a spectrum of complex reasoning tasks.

307

In-Context Learning with Iterative Demonstration Selection

no code implementations • 15 Oct 2023 • Chengwei Qin, Aston Zhang, Anirudh Dagar, Wenming Ye

The output reasoning path is then used to choose demonstrations that are prepended to the test sample for inference.

Few-Shot Learning In-Context Learning +3

You Only Look at Screens: Multimodal Chain-of-Action Agents

2 code implementations • 20 Sep 2023 • Zhuosheng Zhang, Aston Zhang

Autonomous user interface (UI) agents aim to facilitate task automation by interacting with the user interface without manual intervention.

Type prediction

133

Automated Few-shot Classification with Instruction-Finetuned Language Models

1 code implementation • 21 May 2023 • Rami Aly, Xingjian Shi, Kaixiang Lin, Aston Zhang, Andrew Gordon Wilson

We observe, in the context of classification tasks, that instruction finetuned language models exhibit remarkable prompt robustness, and we subsequently propose a simple method to eliminate the need for handcrafted prompts, named AuT-Few.

Classification Few-Shot Learning +1

Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition

1 code implementation • NeurIPS 2023 • Shuhuai Ren, Aston Zhang, Yi Zhu, Shuai Zhang, Shuai Zheng, Mu Li, Alex Smola, Xu Sun

This work proposes POMP, a prompt pre-training method for vision-language models.

Ranked #1 on Open Vocabulary Semantic Segmentation on COCO-Stuff-171

Image Classification object-detection +3

245

A Cheaper and Better Diffusion Language Model with Soft-Masked Noise

1 code implementation • 10 Apr 2023 • Jiaao Chen, Aston Zhang, Mu Li, Alex Smola, Diyi Yang

Diffusion models that are based on iterative denoising have been recently proposed and leveraged in various generation tasks like image generation.

Denoising Image Generation +1

Is ChatGPT a General-Purpose Natural Language Processing Task Solver?

1 code implementation • 8 Feb 2023 • Chengwei Qin, Aston Zhang, Zhuosheng Zhang, Jiaao Chen, Michihiro Yasunaga, Diyi Yang

Spurred by advancements in scale, large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot -- i.e., without adaptation on downstream data.

Arithmetic Reasoning Zero-Shot Learning

210

AIM: Adapting Image Models for Efficient Video Action Recognition

1 code implementation • 6 Feb 2023 • Taojiannan Yang, Yi Zhu, Yusheng Xie, Aston Zhang, Chen Chen, Mu Li

Recent vision transformer based video models mostly follow the "image pre-training then finetuning" paradigm and have achieved great success on multiple video benchmarks.

Ranked #2 on Action Recognition on Diving-48 (using extra training data)

Action Classification Action Recognition +2

240

Multimodal Chain-of-Thought Reasoning in Language Models

3 code implementations • 2 Feb 2023 • Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis, Alex Smola

Large language models (LLMs) have shown impressive performance on complex reasoning by leveraging chain-of-thought (CoT) prompting to generate intermediate reasoning chains as the rationale to infer the answer.

Ranked #3 on Science Question Answering on ScienceQA

Language Modelling Science Question Answering

3,672

Parameter-Efficient Fine-Tuning Design Spaces

no code implementations • 4 Jan 2023 • Jiaao Chen, Aston Zhang, Xingjian Shi, Mu Li, Alex Smola, Diyi Yang

We discover the following design patterns: (i) group layers in a spindle pattern; (ii) allocate the number of trainable parameters to layers uniformly; (iii) tune all the groups; (iv) assign proper tuning strategies to different groups.

Learning Multimodal Data Augmentation in Feature Space

1 code implementation • 29 Dec 2022 • Zichang Liu, Zhiqiang Tang, Xingjian Shi, Aston Zhang, Mu Li, Anshumali Shrivastava, Andrew Gordon Wilson

The ability to jointly learn from multiple modalities, such as text, audio, and visual data, is a defining feature of intelligent systems.

Data Augmentation Image Classification +1

SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning

no code implementations • 21 Dec 2022 • M Saiful Bari, Aston Zhang, Shuai Zheng, Xingjian Shi, Yi Zhu, Shafiq Joty, Mu Li

Pre-trained large language models can efficiently interpolate human-written prompts in a natural way.

Language Modelling Zero-shot Generalization

SMILE: Scaling Mixture-of-Experts with Efficient Bi-level Routing

no code implementations • 10 Dec 2022 • Chaoyang He, Shuai Zheng, Aston Zhang, George Karypis, Trishul Chilimbi, Mahdi Soltanolkotabi, Salman Avestimehr

Mixture-of-Experts (MoE) parallelism is a recent advancement that scales up the model size with constant computational cost.

Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork

1 code implementation • 12 Oct 2022 • Haotao Wang, Junyuan Hong, Aston Zhang, Jiayu Zhou, Zhangyang Wang

As a result, both the stem and the classification head in the final network are hardly affected by backdoor training samples.

backdoor defense Classification +1

Automatic Chain of Thought Prompting in Large Language Models

5 code implementations • 7 Oct 2022 • Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola

Providing these steps for prompting demonstrations is called chain-of-thought (CoT) prompting.

17,332

Removing Batch Normalization Boosts Adversarial Training

1 code implementation • 4 Jul 2022 • Haotao Wang, Aston Zhang, Shuai Zheng, Xingjian Shi, Mu Li, Zhangyang Wang

In addition, NoFrost achieves $23.56\%$ adversarial robustness against PGD attack, improving on the $13.57\%$ robustness of BN-based AT.

Adversarial Robustness

Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition

1 code implementation • 4 Jul 2022 • Haotao Wang, Aston Zhang, Yi Zhu, Shuai Zheng, Mu Li, Alex Smola, Zhangyang Wang

However, in real-world applications, it is common for the training sets to have long-tailed distributions.

Anomaly Detection Contrastive Learning +2

MixGen: A New Multi-Modal Data Augmentation

1 code implementation • 16 Jun 2022 • Xiaoshuai Hao, Yi Zhu, Srikar Appalaraju, Aston Zhang, Wanqian Zhang, Bo Li, Mu Li

Data augmentation is a necessity to enhance data efficiency in deep learning.

Data Augmentation Question Answering +7

104

Self-Instantiated Recurrent Units with Dynamic Soft Recursion

no code implementations • NeurIPS 2021 • Aston Zhang, Yi Tay, Yikang Shen, Alvin Chan Guo Wei, Shuai Zhang

On the other hand, the extent of the Self-IRU recursion is controlled by gates whose values are between 0 and 1 and may vary across the temporal dimension of sequences, enabling dynamic soft recursion depth at each time step.

Inductive Bias

PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions

4 code implementations • 8 Oct 2021 • Eleonora Grassucci, Aston Zhang, Danilo Comminiello

In this paper, we define the parameterization of hypercomplex convolutional layers and introduce the family of parameterized hypercomplex neural networks (PHNNs) that are lightweight and efficient large-scale models.

Ranked #1 on Sound Event Detection on L3DAS21

Sound Event Detection

On Orthogonality Constraints for Transformers

no code implementations • ACL 2021 • Aston Zhang, Alvin Chan, Yi Tay, Jie Fu, Shuohang Wang, Shuai Zhang, Huajie Shao, Shuochao Yao, Roy Ka-Wei Lee

Orthogonality constraints encourage matrices to be orthogonal for numerical stability.

Dialogue Generation Machine Translation +1

Dive into Deep Learning

1 code implementation • 21 Jun 2021 • Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola

This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code.

Math Multi-Domain Recommender Systems

21,708

Controllable and Diverse Text Generation in E-commerce

no code implementations • 23 Feb 2021 • Huajie Shao, Jun Wang, Haohong Lin, Xuezhou Zhang, Aston Zhang, Heng Ji, Tarek Abdelzaher

The algorithm is injected into a Conditional Variational Autoencoder (CVAE), allowing Apex to control both (i) the order of keywords in the generated sentences (conditioned on the input keywords and their order), and (ii) the trade-off between diversity and accuracy.

Text Generation

Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters

3 code implementations • 17 Feb 2021 • Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Cheung Hui, Jie Fu

Recent works have demonstrated reasonable success of representation learning in hypercomplex space.

Machine Translation Natural Language Inference +4

A Unified Lottery Ticket Hypothesis for Graph Neural Networks

2 code implementations • 12 Feb 2021 • Tianlong Chen, Yongduo Sui, Xuxi Chen, Aston Zhang, Zhangyang Wang

With graphs rapidly growing in size and deeper graph neural networks (GNNs) emerging, the training and inference of GNNs become increasingly expensive.

Link Prediction Node Classification

Recurrently Controlling a Recurrent Network with Recurrent Networks Controlled by More Recurrent Networks

no code implementations • 1 Jan 2021 • Yi Tay, Yikang Shen, Alvin Chan, Aston Zhang, Shuai Zhang

This paper explores an intriguing idea of recursively parameterizing recurrent nets.

Code Generation Inductive Bias +4

Parameterization of Hypercomplex Multiplications

no code implementations • ICLR 2021 • Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Hui, Jie Fu

Recent works have demonstrated reasonable success of representation learning in hypercomplex space.

Machine Translation Natural Language Inference +4

Learning User Representations with Hypercuboids for Recommender Systems

3 code implementations • 11 Nov 2020 • Shuai Zhang, Huoyu Liu, Aston Zhang, Yue Hu, Ce Zhang, Yumeng Li, Tanchao Zhu, Shaojian He, Wenwu Ou

Furthermore, we present two variants of hypercuboids to enhance the capability in capturing the diversities of user interests.

Collaborative Filtering Recommendation Systems

ControlVAE: Tuning, Analytical Properties, and Performance Analysis

4 code implementations • 31 Oct 2020 • Huajie Shao, Zhisheng Xiao, Shuochao Yao, Aston Zhang, Shengzhong Liu, Tarek Abdelzaher

ControlVAE is a new variational autoencoder (VAE) framework that combines the automatic control theory with the basic VAE to stabilize the KL-divergence of VAE models to a specified value.

Disentanglement Image Generation +1

21,708

Text Style Transfer: A Review and Experimental Evaluation

2 code implementations • 24 Oct 2020 • Zhiqiang Hu, Roy Ka-Wei Lee, Charu C. Aggarwal, Aston Zhang

This article aims to provide a comprehensive review of recent research efforts on text style transfer.

Style Transfer Text Style Transfer

1,589

Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder

2 code implementations • Findings of the Association for Computational Linguistics 2020 • Alvin Chan, Yi Tay, Yew-Soon Ong, Aston Zhang

This paper demonstrates a fatal vulnerability in natural language inference (NLI) and text classification systems.

General Classification Natural Language Inference +2

CoCon: A Self-Supervised Approach for Controlled Text Generation

1 code implementation • ICLR 2021 • Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu

While there are studies that seek to control high-level attributes (such as sentiment and topic) of generated text, there is still a lack of more precise control over its content at the word- and phrase-level.

Text Generation

paper2repo: GitHub Repository Recommendation for Academic Papers

no code implementations • 13 Apr 2020 • Huajie Shao, Dachun Sun, Jiahao Wu, Zecheng Zhang, Aston Zhang, Shuochao Yao, Shengzhong Liu, Tianshi Wang, Chao Zhang, Tarek Abdelzaher

Motivated by this trend, we describe a novel item-item cross-platform recommender system, paper2repo, that recommends relevant repositories on GitHub that match a given paper in an academic search system such as Microsoft Academic.

Recommendation Systems

ControlVAE: Controllable Variational Autoencoder

no code implementations • ICML 2020 • Huajie Shao, Shuochao Yao, Dachun Sun, Aston Zhang, Shengzhong Liu, Dongxin Liu, Jun Wang, Tarek Abdelzaher

Variational Autoencoders (VAE) and their variants have been widely used in a variety of applications, such as dialog generation, image generation and disentangled representation learning.

Image Generation Language Modelling +1

Transformer on a Diet

1 code implementation • 14 Feb 2020 • Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, Alexander J. Smola

Transformer has been widely used thanks to its ability to capture sequence information in an efficient way.

Language Modelling

Compositional De-Attention Networks

no code implementations • NeurIPS 2019 • Yi Tay, Anh Tuan Luu, Aston Zhang, Shuohang Wang, Siu Cheung Hui

Attentional models are distinctly characterized by their ability to learn relative importance, i.e., assigning a different weight to input values.

Machine Translation Natural Language Inference +4

R2D2: Reuse & Reduce via Dynamic Weight Diffusion for Training Efficient NLP Models

no code implementations • 25 Sep 2019 • Yi Tay, Aston Zhang, Shuai Zhang, Alvin Chan, Luu Anh Tuan, Siu Cheung Hui

We propose R2D2 layers, a new neural block for training efficient NLP models.

Parsimonious Morpheme Segmentation with an Application to Enriching Word Embeddings

no code implementations • 18 Aug 2019 • Ahmed El-Kishky, Frank Xu, Aston Zhang, Jiawei Han

However, in many languages and specialized corpora, words are composed by concatenating semantically meaningful subword structures.

Language Modelling Segmentation +1

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

4 code implementations • 9 Jul 2019 • Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).

2,548

G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators

2 code implementations • NeurIPS 2021 • Yunhui Long, Boxin Wang, Zhuolin Yang, Bhavya Kailkhura, Aston Zhang, Carl A. Gunter, Bo Li

In particular, we train a student data generator with an ensemble of teacher discriminators and propose a novel private gradient aggregation mechanism to ensure differential privacy on all information that flows from teacher discriminators to the student generator.

BIG-bench Machine Learning Privacy Preserving

Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks

1 code implementation • ACL 2019 • Yi Tay, Aston Zhang, Luu Anh Tuan, Jinfeng Rao, Shuai Zhang, Shuohang Wang, Jie Fu, Siu Cheung Hui

Many state-of-the-art neural models for NLP are heavily parameterized and thus memory inefficient.

Quaternion Collaborative Filtering for Recommendation

no code implementations • 6 Jun 2019 • Shuai Zhang, Lina Yao, Lucas Vinh Tran, Aston Zhang, Yi Tay

All in all, we conduct extensive experiments on six real-world datasets, demonstrating the effectiveness of Quaternion algebra in recommender systems.

Collaborative Filtering Inductive Bias +2

Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives

no code implementations • ACL 2019 • Yi Tay, Shuohang Wang, Luu Anh Tuan, Jie Fu, Minh C. Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, Aston Zhang

This paper tackles the problem of reading comprehension over long narratives where documents easily span over thousands of tokens.

Reading Comprehension

Entropy-Based Subword Mining with an Application to Word Embeddings

no code implementations • WS 2018 • Ahmed El-Kishky, Frank Xu, Aston Zhang, Stephen Macke, Jiawei Han

Recent literature has shown a wide variety of benefits to mapping traditional one-hot representations of words and phrases to lower-dimensional real-valued vectors known as word embeddings.

Language Modelling Machine Translation +3

Expert Finding in Heterogeneous Bibliographic Networks with Locally-trained Embeddings

no code implementations • 9 Mar 2018 • Huan Gui, Qi Zhu, Liyuan Liu, Aston Zhang, Jiawei Han

We study the task of expert finding in heterogeneous bibliographical networks based on two aspects: textual content analysis and authority ranking.

RDeepSense: Reliable Deep Mobile Computing Models with Uncertainty Estimations

no code implementations • 9 Sep 2017 • Shuochao Yao, Yiran Zhao, Huajie Shao, Aston Zhang, Chao Zhang, Shen Li, Tarek Abdelzaher

Recent advances in deep learning have led various applications to unprecedented achievements, which could potentially bring higher intelligence to a broad spectrum of mobile and ubiquitous applications.

DeepIoT: Compressing Deep Neural Network Structures for Sensing Systems with a Compressor-Critic Framework

1 code implementation • 5 Jun 2017 • Shuochao Yao, Yiran Zhao, Aston Zhang, Lu Su, Tarek Abdelzaher

It is thus able to shorten execution time by 71.4% to 94.5%, and decrease energy consumption by 72.2% to 95.7%.

DeepSense: A Unified Deep Learning Framework for Time-Series Mobile Sensing Data Processing

1 code implementation • 7 Nov 2016 • Shuochao Yao, Shaohan Hu, Yiran Zhao, Aston Zhang, Tarek Abdelzaher

For many mobile applications, it is hard to find a distribution that exactly describes the noise in practice.

General Classification Human Activity Recognition +2

