Search Results for author: Zihan Zhang

Found 55 papers, 16 papers with code

Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning

no code implementations 29 Mar 2024 Qinhao Zhou, Zihan Zhang, Xiang Xiang, Ke Wang, Yuchuan Wu, Yongbin Li

As intelligent agents, LLMs need to have the capabilities of task planning, long-term memory, and the ability to leverage external tools to achieve satisfactory performance.

Hallucination

Horizon-Free Regret for Linear Markov Decision Processes

no code implementations 15 Mar 2024 Zihan Zhang, Jason D. Lee, Yuxin Chen, Simon S. Du

A recent line of work showed that regret bounds in reinforcement learning (RL) can be (nearly) independent of the planning horizon, a.k.a. horizon-free bounds.

LEMMA Reinforcement Learning (RL)

ResLoRA: Identity Residual Mapping in Low-Rank Adaption

1 code implementation 28 Feb 2024 Shuhua Shi, Shaohan Huang, Minghui Song, Zhoujun Li, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang

As one of the most popular parameter-efficient fine-tuning (PEFT) methods, low-rank adaptation (LoRA) is commonly applied to fine-tune large language models (LLMs).

RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering

1 code implementation 26 Feb 2024 Zihan Zhang, Meng Fang, Ling Chen

Based on our findings, we propose Time-Aware Adaptive Retrieval (TA-ARE), a simple yet effective method that helps LLMs assess the necessity of retrieval without calibration or additional training.

Open-Domain Question Answering Retrieval

We Choose to Go to Space: Agent-driven Human and Multi-Robot Collaboration in Microgravity

no code implementations 22 Feb 2024 Miao Xin, Zhongrui You, Zihan Zhang, Taoran Jiang, Tingjia Xu, Haotian Liang, Guojing Ge, Yuchen Ji, Shentong Mo, Jian Cheng

We present SpaceAgents-1, a system for learning human and multi-robot collaboration (HMRC) strategies under microgravity conditions.

Decision Making

Text Diffusion with Reinforced Conditioning

no code implementations 19 Feb 2024 Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang

Diffusion models have demonstrated exceptional capability in generating high-quality images, videos, and audio.

Improving Domain Adaptation through Extended-Text Reading Comprehension

1 code implementation 14 Jan 2024 Ting Jiang, Shaohan Huang, Shengyue Luo, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang, Deqing Wang, Fuzhen Zhuang

To enhance the domain-specific capabilities of large language models, continued pre-training on a domain-specific corpus is a prevalent method.

Clustering Domain Adaptation +1

SELM: Speech Enhancement Using Discrete Tokens and Language Models

no code implementations 15 Dec 2023 Ziqian Wang, Xinfa Zhu, Zihan Zhang, YuanJun Lv, Ning Jiang, Guoqing Zhao, Lei Xie

Given the intrinsic similarity between speech generation and speech enhancement, harnessing semantic information holds potential advantages for speech enhancement tasks.

Self-Supervised Learning Speech Enhancement

Optimal Multi-Distribution Learning

no code implementations 8 Dec 2023 Zihan Zhang, Wenhao Zhan, Yuxin Chen, Simon S. Du, Jason D. Lee

Focusing on a hypothesis class of Vapnik-Chervonenkis (VC) dimension $d$, we propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$ (modulo some logarithmic factor), matching the best-known lower bound.

Fairness

OSM vs HD Maps: Map Representations for Trajectory Prediction

no code implementations 4 Nov 2023 Jing-Yan Liao, Parth Doshi, Zihan Zhang, David Paz, Henrik Christensen

While High Definition (HD) Maps have long been favored for their precise depictions of static road elements, their accessibility constraints and susceptibility to rapid environmental changes impede the widespread deployment of autonomous driving, especially in the motion forecasting task.

Motion Forecasting Trajectory Prediction

CITB: A Benchmark for Continual Instruction Tuning

1 code implementation 23 Oct 2023 Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad

In this work, we establish a CIT benchmark consisting of learning and evaluation protocols.

Continual Learning

Democratizing Reasoning Ability: Tailored Learning from Large Language Model

1 code implementation 20 Oct 2023 Zhaoyang Wang, Shaohan Huang, Yuxuan Liu, Jiahai Wang, Minghui Song, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang

In this paper, we propose a tailored learning approach to distill such reasoning ability to smaller LMs to facilitate the democratization of the exclusive reasoning ability.

Instruction Following Language Modelling +1

Auto Search Indexer for End-to-End Document Retrieval

no code implementations 19 Oct 2023 Tianchi Yang, Minghui Song, Zihan Zhang, Haizhen Huang, Weiwei Deng, Feng Sun, Qi Zhang

Generative retrieval, a new and advanced paradigm for document retrieval, has recently attracted research interest because it encodes all documents into the model and directly generates the retrieved documents.

Retrieval

How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances

1 code implementation 11 Oct 2023 Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad, Jun Wang

Although large language models (LLMs) are impressive in solving various tasks, they can quickly be outdated after deployment.

World Knowledge

An Exploration of Task-decoupling on Two-stage Neural Post Filter for Real-time Personalized Acoustic Echo Cancellation

no code implementations 7 Oct 2023 Zihan Zhang, Jiayao Sun, Xianjun Xia, Ziqian Wang, Xiaopeng Yan, Yijian Xiao, Lei Xie

Utilization of speaker representation has extended the frontier of AEC, thus attracting many researchers' interest in personalized acoustic echo cancellation (PAEC).

Acoustic echo cancellation Speech Enhancement

Calibrating LLM-Based Evaluator

no code implementations 23 Sep 2023 Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang

Recent advancements in large language models (LLMs) on language modeling and emergent capabilities make them a promising reference-free evaluator of natural language generation quality, and a competent alternative to human evaluation.

In-Context Learning Language Modelling +1

Universal scaling relation and criticality in metabolism and growth of Escherichia coli

no code implementations 9 Aug 2023 Shaohua Guan, Zhichao Zhang, Zihan Zhang, Hualin Shi

The metabolic network plays a crucial role in regulating bacterial metabolism and growth, but it is subject to inherent molecular stochasticity.

Relation

Classification with Deep Neural Networks and Logistic Loss

no code implementations 31 Jul 2023 Zihan Zhang, Lei Shi, Ding-Xuan Zhou

In this paper, we aim to fill this gap by establishing a novel and elegant oracle-type inequality, which enables us to deal with the boundedness restriction of the target function, and using it to derive sharp convergence rates for fully connected ReLU DNN classifiers trained with logistic loss.

Binary Classification Classification +1

TEDi: Temporally-Entangled Diffusion for Long-Term Motion Synthesis

no code implementations 27 Jul 2023 Zihan Zhang, Richard Liu, Kfir Aberman, Rana Hanocka

The gradual nature of a diffusion process that synthesizes samples in small increments constitutes a key ingredient of Denoising Diffusion Probabilistic Models (DDPM), which have presented unprecedented quality in image synthesis and been recently explored in the motion domain.

Denoising Image Generation +1

Settling the Sample Complexity of Online Reinforcement Learning

no code implementations 25 Jul 2023 Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon S. Du

While a number of recent works achieved asymptotically minimal regret in online RL, the optimality of these results is only guaranteed in a "large-sample" regime, imposing an enormous burn-in cost in order for their algorithms to operate optimally.

reinforcement-learning Reinforcement Learning (RL)

Dual-Alignment Pre-training for Cross-lingual Sentence Embedding

1 code implementation 16 May 2023 Ziheng Li, Shaohan Huang, Zihan Zhang, Zhi-Hong Deng, Qiang Lou, Haizhen Huang, Jian Jiao, Furu Wei, Weiwei Deng, Qi Zhang

Recent studies have shown that dual encoder models trained with the sentence-level translation ranking task are effective methods for cross-lingual sentence embedding.

Language Modelling Sentence +3

Pre-training Language Model as a Multi-perspective Course Learner

no code implementations 6 May 2023 Beiduo Chen, Shaohan Huang, Zihan Zhang, Wu Guo, ZhenHua Ling, Haizhen Huang, Furu Wei, Weiwei Deng, Qi Zhang

Besides, two self-correction courses are proposed to bridge the chasm between the two encoders by creating a "correction notebook" for secondary-supervision.

Language Modelling Masked Language Modeling

Two-step Band-split Neural Network Approach for Full-band Residual Echo Suppression

no code implementations 13 Mar 2023 Zihan Zhang, Shimin Zhang, Mingshuai Liu, Yanhong Leng, Zhe Han, Li Chen, Lei Xie

This paper describes a Two-step Band-split Neural Network (TBNN) approach for full-band acoustic echo cancellation.

Acoustic echo cancellation

Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments

no code implementations 31 Jan 2023 Runlong Zhou, Zihan Zhang, Simon S. Du

We further initiate the study on model-free algorithms with variance-dependent regret bounds by designing a reference-function-based algorithm with a novel capped-doubling reference update schedule.

Decoupling MaxLogit for Out-of-Distribution Detection

no code implementations CVPR 2023 Zihan Zhang, Xiang Xiang

We demonstrate the effectiveness of our logit-based OOD detection methods on CIFAR-10, CIFAR-100 and ImageNet and establish state-of-the-art performance.

Out-of-Distribution Detection

RoChBert: Towards Robust BERT Fine-tuning for Chinese

1 code implementation 28 Oct 2022 Zihan Zhang, Jinfeng Li, Ning Shi, Bo Yuan, Xiangyu Liu, Rong Zhang, Hui Xue, Donghong Sun, Chao Zhang

Despite their superb performance on a wide range of tasks, pre-trained language models (e.g., BERT) have been shown to be vulnerable to adversarial texts.

Data Augmentation Language Modelling

Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning

no code implementations 15 Oct 2022 Zihan Zhang, Yuhang Jiang, Yuan Zhou, Xiangyang Ji

Meanwhile, we show that to achieve $\tilde{O}(\mathrm{poly}(S, A, H)\sqrt{K})$ regret, the number of batches is at least $\Omega\left(H/\log_A(K)+ \log_2\log_2(K) \right)$, which matches our upper bound up to logarithmic terms.

reinforcement-learning Reinforcement Learning (RL)

Predicting Blossom Date of Cherry Tree With Support Vector Machine and Recurrent Neural Network

1 code implementation 10 Oct 2022 Hongyi Zheng, Yanyu Chen, Zihan Zhang

Our project probes the relationship between temperatures and the blossom date of cherry trees.

Hybrid Supervised and Reinforcement Learning for the Design and Optimization of Nanophotonic Structures

no code implementations 8 Sep 2022 Christopher Yeung, Benjamin Pham, Zihan Zhang, Katherine T. Fountaine, Aaswath P. Raman

From higher computational efficiency to enabling the discovery of novel and complex structures, deep learning has emerged as a powerful framework for the design and optimization of nanophotonic circuits and components.

Computational Efficiency reinforcement-learning +1

A Pilot Study of Relating MYCN-Gene Amplification with Neuroblastoma-Patient CT Scans

no code implementations 21 May 2022 Zihan Zhang, Xiang Xiang, Xuehua Peng, Jianbo Shao

Neuroblastoma is one of the most common cancers in infants, and the initial diagnosis of this disease is difficult.

GANimator: Neural Motion Synthesis from a Single Sequence

1 code implementation 5 May 2022 Peizhuo Li, Kfir Aberman, Zihan Zhang, Rana Hanocka, Olga Sorkine-Hornung

We present GANimator, a generative model that learns to synthesize novel motions from a single, short motion sequence.

Motion Synthesis Style Transfer

Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies

no code implementations 24 Mar 2022 Zihan Zhang, Xiangyang Ji, Simon S. Du

This paper gives the first polynomial-time algorithm for tabular Markov Decision Processes (MDP) that enjoys a regret bound independent of the planning horizon.

reinforcement-learning Reinforcement Learning (RL)

Long-Tailed Classification with Gradual Balanced Loss and Adaptive Feature Generation

no code implementations 28 Feb 2022 Zihan Zhang, Xiang Xiang

Real-world data distributions are essentially long-tailed, which poses a great challenge to deep models.

Long-tail Learning

Callee: Recovering Call Graphs for Binaries with Transfer and Contrastive Learning

1 code implementation 2 Nov 2021 Wenyu Zhu, Zhiyao Feng, Zihan Zhang, Jianjun Chen, Zhijian Ou, Min Yang, Chao Zhang

Recovering binary programs' call graphs is crucial for inter-procedural analysis tasks and applications based on them. One of the core challenges is recognizing targets of indirect calls (i.e., indirect callees).

Contrastive Learning Question Answering +1

Improving Non-autoregressive Generation with Mixup Training

1 code implementation 21 Oct 2021 Ting Jiang, Shaohan Huang, Zihan Zhang, Deqing Wang, Fuzhen Zhuang, Furu Wei, Haizhen Huang, Liangjie Zhang, Qi Zhang

While pre-trained language models have achieved great success on various natural language understanding tasks, how to effectively leverage them into non-autoregressive generation tasks remains a challenge.

Natural Language Understanding Paraphrase Generation +2

Learning from My Friends: Few-Shot Personalized Conversation Systems via Social Networks

no code implementations 21 May 2021 Zhiliang Tian, Wei Bi, Zihan Zhang, Dongkyu Lee, Yiping Song, Nevin L. Zhang

The task requires models to generate personalized responses for a speaker given a few conversations from the speaker and a social network.

Meta-Learning

A New Metric on Symmetric Group and Applications to Block Permutation Codes

no code implementations 9 Mar 2021 Chaoping Xing, Zihan Zhang

In this paper, by introducing a novel metric closely related to the block permutation metric, we build a bridge between some advanced algebraic methods and codes in the block permutation metric.

Information Theory Combinatorics

Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP

no code implementations NeurIPS 2021 Zihan Zhang, Jiaqi Yang, Xiangyang Ji, Simon S. Du

With the new confidence sets, we obtain the following regret bounds: For linear bandits, we obtain an $\tilde{O}(\mathrm{poly}(d)\sqrt{1 + \sum_{k=1}^{K}\sigma_k^2})$ data-dependent regret bound, where $d$ is the feature dimension, $K$ is the number of rounds, and $\sigma_k^2$ is the unknown variance of the reward at the $k$-th round.

LEMMA

Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition

no code implementations NeurIPS 2020 Zihan Zhang, Yuan Zhou, Xiangyang Ji

We study the reinforcement learning problem in the setting of finite-horizon episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and episode length $H$. We propose a model-free algorithm UCB-ADVANTAGE and prove that it achieves $\tilde{O}(\sqrt{H^2 SAT})$ regret, where $T=KH$ and $K$ is the number of episodes to play.

reinforcement-learning Reinforcement Learning (RL)

Nearly Minimax Optimal Reward-free Reinforcement Learning

no code implementations 12 Oct 2020 Zihan Zhang, Simon S. Du, Xiangyang Ji

In the planning phase, the agent needs to return a near-optimal policy for arbitrary reward functions.

reinforcement-learning Reinforcement Learning (RL)

Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon

no code implementations 28 Sep 2020 Zihan Zhang, Xiangyang Ji, Simon S. Du

Episodic reinforcement learning generalizes contextual bandits and is often perceived to be more difficult due to its long planning horizon and unknown state-dependent transitions.

Decision Making Multi-Armed Bandits +2

Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity

no code implementations 6 Jun 2020 Zihan Zhang, Yuan Zhou, Xiangyang Ji

In this paper we consider the problem of learning an $\epsilon$-optimal policy for a discounted Markov Decision Process (MDP).

reinforcement-learning Reinforcement Learning (RL)

Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition

no code implementations 21 Apr 2020 Zihan Zhang, Yuan Zhou, Xiangyang Ji

We study the reinforcement learning problem in the setting of finite-horizon episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and episode length $H$.

reinforcement-learning Reinforcement Learning (RL)

Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function

no code implementations NeurIPS 2019 Zihan Zhang, Xiangyang Ji

We present an algorithm based on the Optimism in the Face of Uncertainty (OFU) principle that efficiently learns reinforcement learning (RL) problems modeled by a Markov decision process (MDP) with a finite state-action space.

reinforcement-learning Reinforcement Learning (RL)

HAXMLNet: Hierarchical Attention Network for Extreme Multi-Label Text Classification

no code implementations 24 Mar 2019 Ronghui You, Zihan Zhang, Suyang Dai, Shanfeng Zhu

Extreme multi-label text classification (XMTC) addresses the problem of tagging each text with the most relevant labels from an extreme-scale label set.

General Classification Multi Label Text Classification +2

AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification

3 code implementations NeurIPS 2019 Ronghui You, Zihan Zhang, Ziye Wang, Suyang Dai, Hiroshi Mamitsuka, Shanfeng Zhu

We propose a new label tree-based deep learning model for XMTC, called AttentionXML, with two unique features: 1) a multi-label attention mechanism with raw text as input, which captures the part of the text most relevant to each label; and 2) a shallow and wide probabilistic label tree (PLT), which makes it possible to handle millions of labels, especially "tail labels".

General Classification Multi-Label Text Classification +3
