Search Results for author: Shaokun Zhang

Found 10 papers, 5 papers with code

StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows

2 code implementations • 17 Mar 2024 • Yiran Wu, Tianwei Yue, Shaokun Zhang, Chi Wang, Qingyun Wu

In StateFlow, we distinguish between "process grounding" (via state and state transitions) and "sub-task solving" (through actions within a state), enhancing control and interpretability of the task-solving procedure.

Management
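
The snippet above separates "process grounding" (which state the solver is in and how it moves between states) from "sub-task solving" (the actions run inside a state). Below is a minimal sketch of that pattern; the state names, actions, and the `call_llm` stub are hypothetical illustrations, not StateFlow's actual interface.

```python
# Minimal state-machine workflow sketch (hypothetical states and actions,
# not StateFlow's actual API).

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; replace with your own client."""
    return f"[LLM response to: {prompt[:40]}...]"

# Each state owns an action; the action does the sub-task work and returns
# the name of the next state (process grounding via explicit transitions).
def init_action(ctx):
    ctx["plan"] = call_llm(f"Draft a plan for: {ctx['task']}")
    return "solve"

def solve_action(ctx):
    ctx["answer"] = call_llm(f"Execute the plan: {ctx['plan']}")
    return "verify"

def verify_action(ctx):
    ctx["verdict"] = call_llm(f"Check the answer: {ctx['answer']}")
    return "end"

STATES = {"init": init_action, "solve": solve_action, "verify": verify_action}

def run_workflow(task: str, start: str = "init") -> dict:
    ctx, state = {"task": task}, start
    while state != "end":
        state = STATES[state](ctx)
    return ctx

if __name__ == "__main__":
    print(run_workflow("Summarize the tradeoffs of quicksort vs. mergesort"))
```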

Training Language Model Agents without Modifying Language Models

no code implementations • 17 Feb 2024 • Shaokun Zhang, Jieyu Zhang, Jiale Liu, Linxin Song, Chi Wang, Ranjay Krishna, Qingyun Wu

Researchers and practitioners have recently reframed powerful Large Language Models (LLMs) as agents, enabling them to automate complex tasks largely via the use of specialized functions.

Language Modelling
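
As context for the "specialized functions" mentioned above, here is a generic sketch of exposing such functions (tools) to an agent and dispatching its JSON tool calls. The registry and tool names are hypothetical, and this is not the paper's training procedure, which improves the agent without modifying the language model itself.

```python
# Illustrative tool-using agent scaffold (hypothetical registry and tool names;
# not the paper's method, which trains the agent without touching the LLM weights).
import json

TOOLS = {}

def register_tool(name, fn, description):
    """Expose a specialized function to the agent under a short description."""
    TOOLS[name] = {"fn": fn, "description": description}

def square_root(x: float) -> float:
    return x ** 0.5

register_tool("square_root", square_root, "Return the square root of a number.")

def dispatch(tool_call_json: str):
    """Execute a tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]]["fn"](**call["arguments"])

if __name__ == "__main__":
    # In a real agent the LLM would emit this JSON; here it is hard-coded.
    print(dispatch('{"name": "square_root", "arguments": {"x": 2.0}}'))
```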

Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints

no code implementations • 15 Nov 2023 • Xiaobo Xia, Jiale Liu, Shaokun Zhang, Qingyun Wu, Hongxin Wei, Tongliang Liu

Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms.
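
To make the "minimal coreset size under a performance constraint" setting from the title concrete, the toy loop below grows a coreset until a model trained on it clears a validation-accuracy threshold. The random selection order and synthetic data are purely illustrative and not the paper's refined selection method.

```python
# Toy illustration of the problem setting only: grow a coreset until a model
# trained on it meets a validation-accuracy constraint. The random selection
# order below is NOT the paper's refined selection method.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X_train, y_train = X[:1500], y[:1500]
X_val, y_val = X[1500:], y[1500:]

def val_accuracy(idx):
    clf = LogisticRegression(max_iter=200).fit(X_train[idx], y_train[idx])
    return clf.score(X_val, y_val)

target_acc, step = 0.95, 50
order = rng.permutation(len(X_train))       # a real method would rank examples
coreset = []
while len(coreset) < len(X_train):
    coreset.extend(order[len(coreset):len(coreset) + step].tolist())
    if val_accuracy(coreset) >= target_acc:
        break

print(f"coreset size: {len(coreset)} / {len(X_train)}")
```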

IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models

no code implementations • 16 Oct 2023 • Shaokun Zhang, Xiaobo Xia, Zhaoqing Wang, Ling-Hao Chen, Jiale Liu, Qingyun Wu, Tongliang Liu

However, since the prompts need to be sampled from a large volume of annotated examples, finding the right prompt may result in high annotation costs.

In-Context Learning
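
The snippet above concerns choosing which examples to annotate for in-context learning. Below is a generic selective-annotation sketch: annotate a small, spread-out subset of the pool, then retrieve the nearest annotated neighbors as prompt demonstrations. The farthest-point heuristic and embedding setup are stand-ins, not the influence-driven criterion the paper proposes.

```python
# Generic selective-annotation sketch for in-context learning (NOT the paper's
# influence-driven criterion): annotate a small, diverse subset of the pool,
# then retrieve the nearest annotated examples as prompt demonstrations.
import numpy as np

rng = np.random.default_rng(0)
pool = rng.normal(size=(500, 64))         # embeddings of unlabeled candidates

def farthest_point_selection(embs, budget):
    chosen = [0]
    dists = np.linalg.norm(embs - embs[0], axis=1)
    for _ in range(budget - 1):
        nxt = int(dists.argmax())         # point farthest from the chosen set
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(embs - embs[nxt], axis=1))
    return chosen

annotate_ids = farthest_point_selection(pool, budget=20)   # send these for labeling

def retrieve_demos(query_emb, k=4):
    d = np.linalg.norm(pool[annotate_ids] - query_emb, axis=1)
    return [annotate_ids[i] for i in np.argsort(d)[:k]]    # prompt with these examples

print(retrieve_demos(rng.normal(size=64)))
```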

An Empirical Study on Challenging Math Problem Solving with GPT-4

1 code implementation • 2 Jun 2023 • Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang

Employing Large Language Models (LLMs) to address mathematical problems is an intriguing research endeavor, considering the abundance of math problems expressed in natural language across numerous science and engineering fields.

Elementary Mathematics • Math

HyperTime: Hyperparameter Optimization for Combating Temporal Distribution Shifts

no code implementations • 28 May 2023 • Shaokun Zhang, Yiran Wu, Zhonghua Zheng, Qingyun Wu, Chi Wang

In this work, we propose a hyperparameter optimization method named HyperTime to find hyperparameters robust to potential temporal distribution shifts in the unseen test data.

Hyperparameter Optimization • Philosophy
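
One simple way to bias hyperparameter search toward temporal robustness, in the spirit of the description above, is to evaluate each configuration on chronologically ordered validation folds and score it by its worst fold. The sketch below does that with a ridge regressor on synthetic drifting data; it illustrates the objective only and is not the HyperTime algorithm itself.

```python
# Simplified illustration only (not HyperTime itself): score each hyperparameter
# configuration by its WORST loss across chronologically ordered validation folds,
# favoring configurations robust to temporal distribution shift.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
t = np.arange(3000)
X = rng.normal(size=(3000, 5)) + 0.001 * t[:, None]       # slowly drifting features
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.01 * t + rng.normal(size=3000)

def chronological_folds(n, k=3):
    edges = np.linspace(0, n, k + 2, dtype=int)
    # train on everything before each fold, validate on the fold itself
    return [(np.arange(edges[i + 1]), np.arange(edges[i + 1], edges[i + 2]))
            for i in range(k)]

def worst_fold_loss(alpha):
    losses = []
    for tr, va in chronological_folds(len(X)):
        model = Ridge(alpha=alpha).fit(X[tr], y[tr])
        losses.append(mean_squared_error(y[va], model.predict(X[va])))
    return max(losses)

best = min([0.01, 0.1, 1.0, 10.0, 100.0], key=worst_fold_loss)
print("alpha chosen by worst-fold loss:", best)
```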

You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient

1 code implementation • 4 Jun 2021 • Shaokun Zhang, Xiawu Zheng, Chenyi Yang, Yuchao Li, Yan Wang, Fei Chao, Mengdi Wang, Shen Li, Jun Yang, Rongrong Ji

Motivated by the necessity of efficient inference across various constraints on BERT, we propose a novel approach, YOCO-BERT, to compress once and deploy everywhere.

AutoML • Model Compression
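
The "compress once, deploy everywhere" idea above rests on searching over sub-network configurations of a trained super-network. The toy loop below samples BERT-style (depth, width) configurations from learned categorical distributions and reweights them by a stand-in reward; both the reward and the plain reward-weighted update are hypothetical, not the exploit-explore stochastic nature gradient used by YOCO-BERT.

```python
# Toy exploit-explore search over BERT-style sub-network configurations.
# The reward and the plain reward-weighted update below are hypothetical;
# YOCO-BERT instead uses an exploit-explore stochastic nature-gradient update.
import numpy as np

rng = np.random.default_rng(0)
depth_choices = [4, 6, 8, 12]            # number of transformer layers
width_choices = [256, 384, 512, 768]     # hidden size
p_depth = np.ones(len(depth_choices)) / len(depth_choices)
p_width = np.ones(len(width_choices)) / len(width_choices)

def fake_reward(depth, width, budget=4000):
    """Stand-in for (task quality minus a resource penalty); replace with real evaluation."""
    quality = np.log1p(depth * width)
    penalty = max(0.0, depth * width - budget) * 1e-3
    return quality - penalty + rng.normal(scale=0.05)

for _ in range(200):
    d = rng.choice(len(depth_choices), p=p_depth)     # explore: sample a config
    w = rng.choice(len(width_choices), p=p_width)
    r = fake_reward(depth_choices[d], width_choices[w])
    p_depth[d] *= np.exp(0.05 * r)                    # exploit: reweight good choices
    p_width[w] *= np.exp(0.05 * r)
    p_depth /= p_depth.sum()
    p_width /= p_width.sum()

print("most likely config:",
      depth_choices[int(p_depth.argmax())], "layers,",
      width_choices[int(p_width.argmax())], "hidden units")
```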

DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning

1 code implementation • 28 May 2019 • Xiawu Zheng, Chenyi Yang, Shaokun Zhang, Yan Wang, Baochang Zhang, Yongjian Wu, Yunsheng Wu, Ling Shao, Rongrong Ji

With the proposed efficient network generation method, we directly obtain the optimal neural architectures under given constraints, which is practical for on-device models across diverse search spaces and constraints.

Neural Architecture Search
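
Dynamic distribution pruning, as named in the title, keeps a probability distribution over candidate operations, samples from it during search, and progressively drops the least likely candidates. The sketch below illustrates that loop for a single searchable edge with a hard-coded stand-in for validation accuracy; it is not the DDPNAS training setup.

```python
# Simplified dynamic-distribution-pruning loop for a single searchable edge
# (illustration of the idea only; hypothetical scoring, not the DDPNAS setup).
import numpy as np

rng = np.random.default_rng(0)
ops = ["skip", "conv3x3", "conv5x5", "maxpool", "sepconv3x3"]
true_quality = {"skip": 0.2, "conv3x3": 0.8, "conv5x5": 0.7,
                "maxpool": 0.3, "sepconv3x3": 0.9}   # stand-in for validation accuracy
probs = {op: 1.0 / len(ops) for op in ops}

while len(probs) > 1:
    # Sample operations from the current distribution and observe noisy rewards.
    scores = {op: [] for op in probs}
    for _ in range(50):
        op = rng.choice(list(probs), p=list(probs.values()))
        scores[op].append(true_quality[op] + rng.normal(scale=0.1))
    # Update the distribution toward better-scoring operations.
    mean = {op: (np.mean(s) if s else 0.0) for op, s in scores.items()}
    z = sum(np.exp(5 * m) for m in mean.values())
    probs = {op: float(np.exp(5 * m)) / z for op, m in mean.items()}
    # Dynamic pruning: drop the currently least likely operation each round.
    probs.pop(min(probs, key=probs.get))
    total = sum(probs.values())
    probs = {op: p / total for op, p in probs.items()}

print("selected operation:", next(iter(probs)))
```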

Layerwise Recurrent Autoencoder for General Real-world Traffic Flow Forecasting

no code implementations • 27 Sep 2018 • Peize Zhao, Danfeng Cai, Shaokun Zhang, Feng Chen, Zhemin Zhang, Cheng Wang, Jonathan Li

To forecast traffic flow across a wide area and overcome the aforementioned challenges, we design and propose a forecasting model called the Layerwise Recurrent Autoencoder (LRA), in which a three-layer stacked autoencoder (SAE) architecture captures temporal traffic correlations and a recurrent neural network (RNN) model performs the prediction.

Management
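
The snippet above describes LRA as a stacked autoencoder feeding a recurrent predictor. The PyTorch sketch below wires a three-layer autoencoder's latent codes into a GRU that forecasts the next step's flow per sensor; the layer sizes, GRU choice, and sensor count are hypothetical, not the paper's configuration.

```python
# Loose PyTorch sketch of the combination described above: a stacked autoencoder
# compresses each time step's traffic readings, and an RNN predicts the next step.
# Layer sizes and training details are hypothetical, not those of the LRA paper.
import torch
import torch.nn as nn

class StackedAutoencoder(nn.Module):
    def __init__(self, n_sensors=128, dims=(96, 64, 32)):
        super().__init__()
        enc, dec, prev = [], [], n_sensors
        for d in dims:                              # three encoder/decoder layers
            enc += [nn.Linear(prev, d), nn.ReLU()]
            dec = [nn.ReLU(), nn.Linear(d, prev)] + dec
            prev = d
        self.encoder = nn.Sequential(*enc)
        self.decoder = nn.Sequential(*dec[1:])      # drop the leading ReLU
    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

class TrafficForecaster(nn.Module):
    def __init__(self, n_sensors=128, latent=32, hidden=64):
        super().__init__()
        self.sae = StackedAutoencoder(n_sensors)
        self.rnn = nn.GRU(latent, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_sensors)    # predict next-step flow per sensor
    def forward(self, x):                           # x: (batch, time, sensors)
        _, z = self.sae(x)
        out, _ = self.rnn(z)
        return self.head(out[:, -1])

model = TrafficForecaster()
dummy = torch.randn(8, 12, 128)                     # 8 sequences of 12 time steps
print(model(dummy).shape)                           # torch.Size([8, 128])
```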
