2 code implementations • 17 Mar 2024 • Yiran Wu, Tianwei Yue, Shaokun Zhang, Chi Wang, Qingyun Wu
In StateFlow, we distinguish between "process grounding" (via states and state transitions) and "sub-task solving" (through actions within a state), thereby enhancing the control and interpretability of the task-solving procedure.
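As a rough illustration of this split, the sketch below hard-codes a tiny state machine in which each state runs a sub-task-solving action and a transition rule decides what comes next; the state names, actions, and `llm` callable are hypothetical, not StateFlow's actual interface.

```python
# Hypothetical sketch of a state-driven task-solving loop (not StateFlow's API):
# actions do the sub-task solving inside a state; transitions ground the process.
def solve_with_states(task, llm):
    context = {"task": task}
    actions = {
        "retrieve": lambda c: c.update(facts=llm(f"Gather facts for: {c['task']}")),
        "solve": lambda c: c.update(answer=llm(f"Using {c['facts']}, solve: {c['task']}")),
        "verify": lambda c: c.update(ok="yes" in llm(f"Correct (yes/no)? {c['answer']}")),
    }
    transitions = {
        "init": lambda c: "retrieve",
        "retrieve": lambda c: "solve",
        "solve": lambda c: "verify",
        "verify": lambda c: "done" if c["ok"] else "solve",  # retry on failure
    }
    state = "init"
    while state != "done":
        if state in actions:
            actions[state](context)          # sub-task solving within the state
        state = transitions[state](context)  # process grounding via transitions
    return context["answer"]
```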
no code implementations • 17 Feb 2024 • Shaokun Zhang, Jieyu Zhang, Jiale Liu, Linxin Song, Chi Wang, Ranjay Krishna, Qingyun Wu
Researchers and practitioners have recently reframed powerful Large Language Models (LLMs) as agents, enabling them to automate complex tasks largely through the use of specialized functions.
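A minimal sketch of that specialized-function pattern appears below: the LLM emits a function call as JSON and the agent dispatches it. The tool names and message format here are made up for illustration.

```python
import json

# Hypothetical tool registry; a real agent registers domain-specific functions.
TOOLS = {
    "search_papers": lambda query: f"top results for {query!r}",
    "run_sql": lambda sql: f"rows returned for {sql!r}",
}

def dispatch(llm_output: str) -> str:
    """Parse an LLM-emitted call like '{"name": ..., "arguments": ...}' and run it."""
    call = json.loads(llm_output)
    fn = TOOLS[call["name"]]
    return fn(**call.get("arguments", {}))  # result is fed back to the LLM

print(dispatch('{"name": "search_papers", "arguments": {"query": "coresets"}}'))
```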
no code implementations • 15 Nov 2023 • Xiaobo Xia, Jiale Liu, Shaokun Zhang, Qingyun Wu, Hongxin Wei, Tongliang Liu
Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms.
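As a toy example of the idea (not this paper's method), score-based coreset selection trains a cheap proxy model, scores every example, and keeps only a budgeted fraction:

```python
import numpy as np

def select_coreset(losses: np.ndarray, budget: float) -> np.ndarray:
    """Keep the hardest budget-fraction of examples, ranked by proxy-model loss."""
    k = max(1, int(budget * len(losses)))
    return np.argsort(losses)[-k:]  # indices of the k highest-loss examples

# Usage: train the full model only on X[idx], y[idx] to cut compute.
losses = np.random.rand(10_000)           # stand-in for real per-example losses
idx = select_coreset(losses, budget=0.1)  # retain 10% of the data
```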
no code implementations • 16 Oct 2023 • Shaokun Zhang, Xiaobo Xia, Zhaoqing Wang, Ling-Hao Chen, Jiale Liu, Qingyun Wu, Tongliang Liu
However, since the prompts need to be sampled from a large volume of annotated examples, finding the right prompt may result in high annotation costs.
1 code implementation • 16 Aug 2023 • Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang
AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks.
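The two-agent pattern from the project's documentation looks roughly like this (model name, API key, and task message are placeholders):

```python
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_KEY"}]}
assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",  # fully automated back-and-forth
    code_execution_config={"work_dir": "coding", "use_docker": False},
)
# The proxy converses with the assistant (executing any code it writes)
# until the task is complete.
user_proxy.initiate_chat(assistant, message="Plot NVDA's stock price YTD.")
```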
1 code implementation • 2 Jun 2023 • Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang
Employing Large Language Models (LLMs) to address mathematical problems is an intriguing research endeavor, considering the abundance of math problems expressed in natural language across numerous science and engineering fields.
no code implementations • 28 May 2023 • Shaokun Zhang, Yiran Wu, Zhonghua Zheng, Qingyun Wu, Chi Wang
In this work, we propose a hyperparameter optimization method named HyperTime to find hyperparameters robust to potential temporal distribution shifts in the unseen test data.
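A much-simplified sketch of the idea: evaluate each configuration on chronologically ordered validation folds and score it by its worst fold, so hyperparameters that fail under distribution shift are penalized (HyperTime itself uses lexicographic objectives; the fold scheme below is only illustrative).

```python
import numpy as np

def worst_fold_loss(train_eval, data, config, n_folds=4):
    """train_eval(train, valid, config) -> loss; `data` must be time-ordered."""
    folds = np.array_split(np.arange(len(data)), n_folds + 1)
    losses = []
    for i in range(1, n_folds + 1):
        train = np.concatenate(folds[:i])  # everything before the fold
        valid = folds[i]                   # the next chronological slice
        losses.append(train_eval(data[train], data[valid], config))
    return max(losses)  # robustness: judge a config by its worst time period

# best = min(configs, key=lambda c: worst_fold_loss(fit_and_score, data, c))
```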
1 code implementation • 4 Jun 2021 • Shaokun Zhang, Xiawu Zheng, Chenyi Yang, Yuchao Li, Yan Wang, Fei Chao, Mengdi Wang, Shen Li, Jun Yang, Rongrong Ji
Motivated by the need for efficient BERT inference under diverse deployment constraints, we propose a novel approach, YOCO-BERT, that compresses the model once and deploys it everywhere.
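The deployment half of that idea can be pictured as a lookup over sub-networks extracted from one compression run; the candidate table below is purely illustrative, not YOCO-BERT's search space.

```python
# Sub-networks extracted after a single compression run (illustrative numbers).
candidates = [
    {"layers": 12, "heads": 12, "params_m": 110, "score": 0.86},
    {"layers": 6,  "heads": 12, "params_m": 66,  "score": 0.84},
    {"layers": 4,  "heads": 8,  "params_m": 45,  "score": 0.81},
]

def deploy(budget_m: float) -> dict:
    """Pick the most accurate sub-network that fits a device's parameter budget."""
    feasible = [c for c in candidates if c["params_m"] <= budget_m]
    return max(feasible, key=lambda c: c["score"])

print(deploy(70))  # -> the 66M-parameter sub-network
```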
1 code implementation • 28 May 2019 • Xiawu Zheng, Chenyi Yang, Shaokun Zhang, Yan Wang, Baochang Zhang, Yongjian Wu, Yunsheng Wu, Ling Shao, Rongrong Ji
With the proposed efficient network generation method, we directly obtain the optimal neural architectures on given constraints, which is practical for on-device models across diverse search spaces and constraints.
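One hypothetical way to realize constraint-conditioned generation is a small network that maps a resource budget directly to an architecture encoding, so a new constraint costs one forward pass instead of a fresh search (this generator is a sketch, not the paper's model):

```python
import torch
import torch.nn as nn

class ArchGenerator(nn.Module):
    """Map a normalized resource budget in [0, 1] to per-layer width multipliers."""
    def __init__(self, n_layers: int = 12):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, n_layers), nn.Sigmoid()
        )

    def forward(self, budget: torch.Tensor) -> torch.Tensor:
        return self.net(budget)  # one width multiplier per layer, in (0, 1)

gen = ArchGenerator()
widths = gen(torch.tensor([[0.5]]))  # architecture for a mid-range budget
print(widths.shape)                  # torch.Size([1, 12])
```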
no code implementations • 27 Sep 2018 • Peize Zhao, Danfeng Cai, Shaokun Zhang, Feng Chen, Zhemin Zhang, Cheng Wang, Jonathan Li
To forecast traffic flow across a wide area and overcome the aforementioned challenges, we design a forecasting model called Layerwise Recurrent Autoencoder (LRA), in which a three-layer stacked autoencoder (SAE) captures temporal traffic correlations and a recurrent neural network (RNN) performs the prediction.
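A rough PyTorch sketch of that architecture (dimensions are illustrative; a GRU stands in for the RNN component, and the mirrored decoder used for layerwise pretraining is omitted):

```python
import torch
import torch.nn as nn

class LRASketch(nn.Module):
    def __init__(self, n_sensors=200, hidden=(128, 64, 32), rnn_hidden=64):
        super().__init__()
        dims = [n_sensors, *hidden]
        # Three-layer stacked-autoencoder encoder for temporal traffic features.
        self.encoder = nn.Sequential(*[
            layer for i in range(3)
            for layer in (nn.Linear(dims[i], dims[i + 1]), nn.ReLU())
        ])
        self.rnn = nn.GRU(hidden[-1], rnn_hidden, batch_first=True)
        self.head = nn.Linear(rnn_hidden, n_sensors)  # next-step flow per sensor

    def forward(self, x):               # x: (batch, time, n_sensors)
        z = self.encoder(x)             # encode each timestep
        out, _ = self.rnn(z)            # model temporal dynamics
        return self.head(out[:, -1])    # forecast the next timestep

model = LRASketch()
pred = model(torch.randn(8, 12, 200))   # 8 sequences of 12 timesteps
print(pred.shape)                       # torch.Size([8, 200])
```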