Search Results for author: Aston Zhang

Found 48 papers, 29 papers with code

Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents

1 code implementation • 20 Nov 2023 • Zhuosheng Zhang, Yao Yao, Aston Zhang, Xiangru Tang, Xinbei Ma, Zhiwei He, Yiming Wang, Mark Gerstein, Rui Wang, Gongshen Liu, Hai Zhao

Large language models (LLMs) have dramatically enhanced the field of language intelligence, as demonstrably evidenced by their formidable empirical performance across a spectrum of complex reasoning tasks.

In-Context Learning with Iterative Demonstration Selection

no code implementations • 15 Oct 2023 • Chengwei Qin, Aston Zhang, Anirudh Dagar, Wenming Ye

The output reasoning path is then used to choose demonstrations that are prepended to the test sample for inference.

Few-Shot Learning In-Context Learning +3

You Only Look at Screens: Multimodal Chain-of-Action Agents

2 code implementations • 20 Sep 2023 • Zhuosheng Zhang, Aston Zhang

Autonomous user interface (UI) agents aim to facilitate task automation by interacting with the user interface without manual intervention.

Type prediction

Automated Few-shot Classification with Instruction-Finetuned Language Models

1 code implementation • 21 May 2023 • Rami Aly, Xingjian Shi, Kaixiang Lin, Aston Zhang, Andrew Gordon Wilson

We observe, in the context of classification tasks, that instruction finetuned language models exhibit remarkable prompt robustness, and we subsequently propose a simple method to eliminate the need for handcrafted prompts, named AuT-Few.

Classification Few-Shot Learning +1

A Cheaper and Better Diffusion Language Model with Soft-Masked Noise

1 code implementation • 10 Apr 2023 • Jiaao Chen, Aston Zhang, Mu Li, Alex Smola, Diyi Yang

Diffusion models that are based on iterative denoising have been recently proposed and leveraged in various generation tasks like image generation.

Denoising Image Generation +1

Is ChatGPT a General-Purpose Natural Language Processing Task Solver?

1 code implementation • 8 Feb 2023 • Chengwei Qin, Aston Zhang, Zhuosheng Zhang, Jiaao Chen, Michihiro Yasunaga, Diyi Yang

Spurred by advancements in scale, large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot -- i.e., without adaptation on downstream data.

Arithmetic Reasoning Zero-Shot Learning

AIM: Adapting Image Models for Efficient Video Action Recognition

1 code implementation • 6 Feb 2023 • Taojiannan Yang, Yi Zhu, Yusheng Xie, Aston Zhang, Chen Chen, Mu Li

Recent vision transformer based video models mostly follow the "image pre-training then finetuning" paradigm and have achieved great success on multiple video benchmarks.

Ranked #1 on Action Recognition on Diving-48 (using extra training data)

Action Classification Action Recognition +2

Multimodal Chain-of-Thought Reasoning in Language Models

3 code implementations • 2 Feb 2023 • Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis, Alex Smola

Large language models (LLMs) have shown impressive performance on complex reasoning by leveraging chain-of-thought (CoT) prompting to generate intermediate reasoning chains as the rationale to infer the answer.

Language Modelling Science Question Answering

Parameter-Efficient Fine-Tuning Design Spaces

no code implementations • 4 Jan 2023 • Jiaao Chen, Aston Zhang, Xingjian Shi, Mu Li, Alex Smola, Diyi Yang

We discover the following design patterns: (i) group layers in a spindle pattern; (ii) allocate the number of trainable parameters to layers uniformly; (iii) tune all the groups; (iv) assign proper tuning strategies to different groups.

Learning Multimodal Data Augmentation in Feature Space

1 code implementation • 29 Dec 2022 • Zichang Liu, Zhiqiang Tang, Xingjian Shi, Aston Zhang, Mu Li, Anshumali Shrivastava, Andrew Gordon Wilson

The ability to jointly learn from multiple modalities, such as text, audio, and visual data, is a defining feature of intelligent systems.

Data Augmentation Image Classification +1

SMILE: Scaling Mixture-of-Experts with Efficient Bi-level Routing

no code implementations • 10 Dec 2022 • Chaoyang He, Shuai Zheng, Aston Zhang, George Karypis, Trishul Chilimbi, Mahdi Soltanolkotabi, Salman Avestimehr

Mixture-of-Experts (MoE) parallelism is a recent advancement that scales up model size at constant computational cost.

Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork

1 code implementation • 12 Oct 2022 • Haotao Wang, Junyuan Hong, Aston Zhang, Jiayu Zhou, Zhangyang Wang

As a result, both the stem and the classification head in the final network are hardly affected by backdoor training samples.

backdoor defense Classification +1

Automatic Chain of Thought Prompting in Large Language Models

5 code implementations • 7 Oct 2022 • Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola

Providing these steps for prompting demonstrations is called chain-of-thought (CoT) prompting.

Removing Batch Normalization Boosts Adversarial Training

1 code implementation • 4 Jul 2022 • Haotao Wang, Aston Zhang, Shuai Zheng, Xingjian Shi, Mu Li, Zhangyang Wang

In addition, NoFrost achieves 23.56% adversarial robustness against PGD attack, improving on the 13.57% robustness of BN-based AT.

Adversarial Robustness

Self-Instantiated Recurrent Units with Dynamic Soft Recursion

no code implementations • NeurIPS 2021 • Aston Zhang, Yi Tay, Yikang Shen, Alvin Chan Guo Wei, Shuai Zhang

On the other hand, the extent of the Self-IRU recursion is controlled by gates whose values are between 0 and 1 and may vary across the temporal dimension of sequences, enabling dynamic soft recursion depth at each time step.

Inductive Bias

PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions

4 code implementations • 8 Oct 2021 • Eleonora Grassucci, Aston Zhang, Danilo Comminiello

In this paper, we define the parameterization of hypercomplex convolutional layers and introduce the family of parameterized hypercomplex neural networks (PHNNs) that are lightweight and efficient large-scale models.

Sound Event Detection

Dive into Deep Learning

1 code implementation • 21 Jun 2021 • Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola

This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code.

Math Multi-Domain Recommender Systems

Controllable and Diverse Text Generation in E-commerce

no code implementations • 23 Feb 2021 • Huajie Shao, Jun Wang, Haohong Lin, Xuezhou Zhang, Aston Zhang, Heng Ji, Tarek Abdelzaher

The algorithm is injected into a Conditional Variational Autoencoder (CVAE), allowing Apex to control both (i) the order of keywords in the generated sentences (conditioned on the input keywords and their order), and (ii) the trade-off between diversity and accuracy.

Text Generation

A Unified Lottery Ticket Hypothesis for Graph Neural Networks

2 code implementations • 12 Feb 2021 • Tianlong Chen, Yongduo Sui, Xuxi Chen, Aston Zhang, Zhangyang Wang

With graphs rapidly growing in size and deeper graph neural networks (GNNs) emerging, the training and inference of GNNs become increasingly expensive.

Link Prediction Node Classification

Learning User Representations with Hypercuboids for Recommender Systems

3 code implementations • 11 Nov 2020 • Shuai Zhang, Huoyu Liu, Aston Zhang, Yue Hu, Ce Zhang, Yumeng Li, Tanchao Zhu, Shaojian He, Wenwu Ou

Furthermore, we present two variants of hypercuboids to enhance the capability in capturing the diversities of user interests.

Collaborative Filtering Recommendation Systems

ControlVAE: Tuning, Analytical Properties, and Performance Analysis

4 code implementations • 31 Oct 2020 • Huajie Shao, Zhisheng Xiao, Shuochao Yao, Aston Zhang, Shengzhong Liu, Tarek Abdelzaher

ControlVAE is a new variational autoencoder (VAE) framework that combines the automatic control theory with the basic VAE to stabilize the KL-divergence of VAE models to a specified value.

Disentanglement Image Generation +1

Text Style Transfer: A Review and Experimental Evaluation

2 code implementations • 24 Oct 2020 • Zhiqiang Hu, Roy Ka-Wei Lee, Charu C. Aggarwal, Aston Zhang

This article aims to provide a comprehensive review of recent research efforts on text style transfer.

Style Transfer Text Style Transfer

CoCon: A Self-Supervised Approach for Controlled Text Generation

1 code implementation • ICLR 2021 • Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu

While there are studies that seek to control high-level attributes (such as sentiment and topic) of generated text, there is still a lack of more precise control over its content at the word- and phrase-level.

Text Generation

ControlVAE: Controllable Variational Autoencoder

no code implementations • ICML 2020 • Huajie Shao, Shuochao Yao, Dachun Sun, Aston Zhang, Shengzhong Liu, Dongxin Liu, Jun Wang, Tarek Abdelzaher

Variational Autoencoders (VAE) and their variants have been widely used in a variety of applications, such as dialog generation, image generation and disentangled representation learning.

Image Generation Language Modelling +1

paper2repo: GitHub Repository Recommendation for Academic Papers

no code implementations • 13 Apr 2020 • Huajie Shao, Dachun Sun, Jiahao Wu, Zecheng Zhang, Aston Zhang, Shuochao Yao, Shengzhong Liu, Tianshi Wang, Chao Zhang, Tarek Abdelzaher

Motivated by this trend, we describe a novel item-item cross-platform recommender system, paper2repo, that recommends relevant repositories on GitHub that match a given paper in an academic search system such as Microsoft Academic.

Recommendation Systems

Transformer on a Diet

1 code implementation • 14 Feb 2020 • Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, Alexander J. Smola

The Transformer has been widely used thanks to its ability to capture sequence information efficiently.

Language Modelling

Compositional De-Attention Networks

no code implementations • NeurIPS 2019 • Yi Tay, Anh Tuan Luu, Aston Zhang, Shuohang Wang, Siu Cheung Hui

Attentional models are distinctly characterized by their ability to learn relative importance, i.e., assigning a different weight to input values.

Machine Translation Natural Language Inference +4

Parsimonious Morpheme Segmentation with an Application to Enriching Word Embeddings

no code implementations • 18 Aug 2019 • Ahmed El-Kishky, Frank Xu, Aston Zhang, Jiawei Han

However, in many languages and specialized corpora, words are composed by concatenating semantically meaningful subword structures.

Language Modelling Segmentation +1

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

4 code implementations • 9 Jul 2019 • Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).

G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators

2 code implementations • NeurIPS 2021 • Yunhui Long, Boxin Wang, Zhuolin Yang, Bhavya Kailkhura, Aston Zhang, Carl A. Gunter, Bo Li

In particular, we train a student data generator with an ensemble of teacher discriminators and propose a novel private gradient aggregation mechanism to ensure differential privacy on all information that flows from teacher discriminators to the student generator.

BIG-bench Machine Learning Privacy Preserving

Quaternion Collaborative Filtering for Recommendation

no code implementations • 6 Jun 2019 • Shuai Zhang, Lina Yao, Lucas Vinh Tran, Aston Zhang, Yi Tay

All in all, we conduct extensive experiments on six real-world datasets, demonstrating the effectiveness of Quaternion algebra in recommender systems.

Collaborative Filtering Inductive Bias +2

Entropy-Based Subword Mining with an Application to Word Embeddings

no code implementations • WS 2018 • Ahmed El-Kishky, Frank Xu, Aston Zhang, Stephen Macke, Jiawei Han

Recent literature has shown a wide variety of benefits to mapping traditional one-hot representations of words and phrases to lower-dimensional real-valued vectors known as word embeddings.

Language Modelling Machine Translation +3

Expert Finding in Heterogeneous Bibliographic Networks with Locally-trained Embeddings

no code implementations • 9 Mar 2018 • Huan Gui, Qi Zhu, Liyuan Liu, Aston Zhang, Jiawei Han

We study the task of expert finding in heterogeneous bibliographical networks based on two aspects: textual content analysis and authority ranking.

RDeepSense: Reliable Deep Mobile Computing Models with Uncertainty Estimations

no code implementations • 9 Sep 2017 • Shuochao Yao, Yiran Zhao, Huajie Shao, Aston Zhang, Chao Zhang, Shen Li, Tarek Abdelzaher

Recent advances in deep learning have led various applications to unprecedented achievements, which could potentially bring higher intelligence to a broad spectrum of mobile and ubiquitous applications.

DeepIoT: Compressing Deep Neural Network Structures for Sensing Systems with a Compressor-Critic Framework

1 code implementation • 5 Jun 2017 • Shuochao Yao, Yiran Zhao, Aston Zhang, Lu Su, Tarek Abdelzaher

It is thus able to shorten execution time by 71.4% to 94.5%, and decrease energy consumption by 72.2% to 95.7%.
