Search Results for author: Yong Dai

Found 27 papers, 10 papers with code

Creating and Evaluating Resources for Sentiment Analysis in the Low-resource Language: Sindhi

no code implementations · EACL (WASSA) 2021 · Wazir Ali, Naveed Ali, Yong Dai, Jay Kumar, Saifullah Tumrani, Zenglin Xu

In this paper, we develop a Sindhi subjective lexicon by merging existing English resources: the NRC lexicon, a list of opinion words, SentiWordNet, a Sindhi-English bilingual dictionary, and a collection of Sindhi modifiers.

Sentiment Analysis · Subjectivity Analysis · +1
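As a toy illustration of the resource merger described above, the sketch below projects English polarity scores onto Sindhi words through a bilingual dictionary. All lexicon entries, romanizations, and scores are invented placeholders, not data from the paper.

```python
# Toy sketch of merging English sentiment resources into a Sindhi lexicon
# via a bilingual dictionary. All entries are hypothetical placeholders.

# English polarity resources (standing in for the NRC lexicon,
# opinion-word lists, and SentiWordNet)
english_polarity = {
    "good": 1.0,
    "bad": -1.0,
    "excellent": 1.0,
}

# English -> Sindhi bilingual dictionary (placeholder romanized entries)
en_sd_dictionary = {
    "good": ["suhno"],
    "bad": ["khrab"],
}

def build_sindhi_lexicon(polarity, dictionary):
    """Project English polarity scores onto Sindhi translations.

    When several English words map to the same Sindhi word, average
    their scores so conflicting resources are reconciled.
    """
    scores = {}
    for en_word, score in polarity.items():
        for sd_word in dictionary.get(en_word, []):
            scores.setdefault(sd_word, []).append(score)
    return {w: sum(s) / len(s) for w, s in scores.items()}

print(build_sindhi_lexicon(english_polarity, en_sd_dictionary))
# {'suhno': 1.0, 'khrab': -1.0}
```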

Self-playing Adversarial Language Game Enhances LLM Reasoning

1 code implementation · 16 Apr 2024 · Pengyu Cheng, Tianhao Hu, Han Xu, Zhisong Zhang, Yong Dai, Lei Han, Nan Du

Hence, we are curious about whether LLMs' reasoning ability can be further enhanced by Self-Play in this Adversarial language Game (SPAG).
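The snippet only hints at the game itself, so here is a minimal, hypothetical skeleton of a two-player self-play loop for an adversarial word game. The stub policies, turn limit, and win rule are all assumptions standing in for the paper's actual LLM-driven setup, in which both roles would be played by the same model and winning episodes would feed back into training.

```python
# Skeleton of a self-play loop for a two-player adversarial language game.
# The policies below are stubs; a real system would prompt an LLM for both.

import random

def attacker_policy(history, target):
    # Stub: a real attacker would be an LLM conditioned on the hidden word.
    return f"utterance hinting at '{target[0]}...'"

def defender_policy(history):
    # Stub: a real defender would be an LLM inferring the hidden word.
    return random.choice(["apple", "river", "engine"])

def play_episode(target, max_turns=4):
    """One self-play episode: the attacker steers the conversation toward a
    hidden target word; the defender tries to guess it within the turn limit."""
    history = []
    for _ in range(max_turns):
        history.append(("attacker", attacker_policy(history, target)))
        guess = defender_policy(history)
        history.append(("defender", guess))
        if guess == target:
            return "defender", history  # defender inferred the word
    return "attacker", history          # defender failed within the limit

winner, transcript = play_episode("river")
print(winner)
```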

Prior-agnostic Multi-scale Contrastive Text-Audio Pre-training for Parallelized TTS Frontend Modeling

no code implementations · 14 Apr 2024 · Quanxiu Wang, Hui Huang, Mingjie Wang, Yong Dai, Jinzuomu Zhong, Benlai Tang

Furthermore, in the second stage, a parallelized TTS frontend model is carefully devised to execute the TN, PD, and PBP prediction tasks.

Polyphone disambiguation
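As a rough sketch of what a "parallelized frontend" could look like, the code below uses one shared encoder with three token-level heads for text normalization (TN), polyphone disambiguation (PD), and prosodic boundary prediction (PBP), so all three predictions come from a single forward pass. The module sizes and layer choices are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ParallelTTSFrontend(nn.Module):
    """Sketch of a multi-task TTS frontend: one shared encoder feeding three
    parallel token-level heads (TN, PD, PBP). Sizes are placeholders."""

    def __init__(self, vocab=5000, d=256, tn_tags=10, pd_tags=40, pbp_tags=5):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.tn_head = nn.Linear(d, tn_tags)
        self.pd_head = nn.Linear(d, pd_tags)
        self.pbp_head = nn.Linear(d, pbp_tags)

    def forward(self, token_ids):
        h = self.encoder(self.embed(token_ids))
        # All three predictions come from one pass, so the tasks run in
        # parallel rather than as a sequential pipeline.
        return self.tn_head(h), self.pd_head(h), self.pbp_head(h)

model = ParallelTTSFrontend()
tn, pd, pbp = model(torch.randint(0, 5000, (2, 16)))
print(tn.shape, pd.shape, pbp.shape)
```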

Enhancing Zero-shot Counting via Language-guided Exemplar Learning

no code implementations · 8 Feb 2024 · Mingjie Wang, Jun Zhou, Yong Dai, Eric Buys, Minglun Gong

Recently, the Class-Agnostic Counting (CAC) problem has garnered increasing attention owing to its intriguing generality and superior efficiency compared to Category-Specific Counting (CSC).

Object Counting · Zero-Shot Counting · +1

Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning

1 code implementation · 30 Jan 2024 · Bang Yang, Yong Dai, Xuxin Cheng, Yaowei Li, Asif Raza, Yuexian Zou

To alleviate catastrophic forgetting (CF) raised by covariate shift and lexical overlap, we further propose a novel approach that ensures the identical distribution of all token embeddings during initialization and regularizes token embedding learning during training.

Text Retrieval
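One plausible reading of "identical distribution during initialization" is to sample new-language token embeddings from a Gaussian fitted to the pretrained embedding table, and one simple form of the training-time regularization is an L2 anchor to the initialization. Both choices below are assumptions, not the paper's exact method.

```python
import torch

def init_new_token_embeddings(old_emb, num_new):
    """Sample new token embeddings from a Gaussian fitted to the existing
    embedding matrix, so old and new tokens start from the same distribution."""
    mean = old_emb.mean(dim=0)
    std = old_emb.std(dim=0)
    return mean + std * torch.randn(num_new, old_emb.size(1))

def embedding_anchor_loss(emb, init_emb):
    """Simple L2 regularizer keeping token embeddings near their
    initialization during training (one plausible regularization form)."""
    return ((emb - init_emb) ** 2).mean()

old = torch.randn(1000, 64)          # pretrained embedding table (placeholder)
new = init_new_token_embeddings(old, 200)
print(new.shape, embedding_anchor_loss(new, new.clone()).item())
```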

WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models

1 code implementation · 25 Jan 2024 · Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, Dong Yu

The rapid advancement of large language models (LLMs) has led to a new era marked by the development of autonomous applications in real-world scenarios, which drives innovation in creating advanced web agents.

Emage: Non-Autoregressive Text-to-Image Generation

no code implementations · 22 Dec 2023 · Zhangyin Feng, Runyi Hu, Liangxin Liu, Fan Zhang, Duyu Tang, Yong Dai, Xiaocheng Feng, Jiwei Li, Bing Qin, Shuming Shi

Compared with autoregressive baselines that need to run one thousand times, our model runs only 16 times to generate images of competitive quality, with an order of magnitude lower inference latency.

Denoising · Text-to-Image Generation
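To make the 16-versus-one-thousand comparison concrete, here is a generic iterative parallel decoding sketch: every position is predicted at once, and only the least confident tokens are re-predicted over a fixed number of refinement steps. The confidence-based schedule and the stub predictor are assumptions, not Emage's actual sampler.

```python
import torch

def iterative_parallel_decode(predict_all, seq_len=1024, steps=16, vocab=8192):
    """Generic non-autoregressive decoding: all tokens are predicted in
    parallel, then the least confident ones are re-predicted over a fixed
    number of refinement steps (16 here, versus seq_len steps for an
    autoregressive model)."""
    tokens = torch.full((seq_len,), -1)          # -1 marks "still masked"
    for step in range(steps):
        logits = predict_all(tokens)             # one full forward pass
        probs, preds = logits.softmax(-1).max(-1)
        # Keep a growing fraction of the most confident predictions.
        keep = int(seq_len * (step + 1) / steps)
        top = probs.topk(keep).indices
        tokens = torch.full((seq_len,), -1)
        tokens[top] = preds[top]
    return tokens

# Stub "model": random logits, standing in for the real image-token predictor.
out = iterative_parallel_decode(lambda t: torch.randn(t.numel(), 8192))
print((out >= 0).float().mean())  # fraction of committed tokens -> 1.0
```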

On Diversified Preferences of Large Language Model Alignment

1 code implementation · 12 Dec 2023 · Dun Zeng, Yong Dai, Pengyu Cheng, Longyue Wang, Tianhao Hu, Wanshun Chen, Nan Du, Zenglin Xu

Our analysis reveals a correlation between the calibration performance of reward models (RMs) and the alignment performance of LLMs.

Language Modelling · Large Language Model
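Since the snippet ties reward-model calibration to alignment, the sketch below measures RM calibration on preference pairs via expected calibration error, treating sigmoid(reward margin) as the predicted preference probability under a Bradley-Terry model. The binning scheme and placeholder data are assumptions, not the paper's exact metric.

```python
import numpy as np

def rm_calibration_ece(margins, labels, bins=10):
    """Expected calibration error of a reward model on preference pairs.

    `margins` are reward differences r(chosen) - r(rejected); under a
    Bradley-Terry model, sigmoid(margin) is the predicted probability that
    the chosen response is preferred. `labels` are 1 when that prediction
    agrees with the human annotation."""
    probs = 1.0 / (1.0 + np.exp(-margins))
    conf = np.maximum(probs, 1 - probs)        # confidence of the argmax side
    correct = (probs > 0.5) == (labels == 1)
    edges = np.linspace(0.5, 1.0, bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf >= lo) & (conf < hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece

margins = np.random.randn(1000)                # placeholder RM outputs
labels = (np.random.rand(1000) < 0.7).astype(int)
print(rm_calibration_ece(margins, labels))
```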

Adversarial Preference Optimization

1 code implementation · 14 Nov 2023 · Pengyu Cheng, Yifan Yang, Jian Li, Yong Dai, Tianhao Hu, Peixin Cao, Nan Du

Human preference alignment is essential to improve the interaction quality of large language models (LLMs).

TencentLLMEval: A Hierarchical Evaluation of Real-World Capabilities for Human-Aligned LLMs

1 code implementation · 9 Nov 2023 · Shuyi Xie, Wenlin Yao, Yong Dai, Shaobo Wang, Donlin Zhou, Lifeng Jin, Xinhua Feng, Pengzhi Wei, Yujie Lin, Zhichao Hu, Dong Yu, Zhengyou Zhang, Jing Nie, Yuhong Liu

We construct a hierarchical task tree encompassing 7 major areas, over 200 categories, and over 800 tasks, spanning diverse capabilities such as question answering, reasoning, multi-turn dialogue, and text generation, to evaluate LLMs in a comprehensive and in-depth manner.

Benchmarking · Question Answering · +1
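A toy version of such a hierarchical task tree, with invented areas, categories, and tasks (the real tree has 7 areas, 200+ categories, and 800+ tasks):

```python
# Toy hierarchical evaluation task tree (area -> category -> task).
# The entries are invented placeholders for illustration only.
task_tree = {
    "reasoning": {
        "math": ["arithmetic word problems", "logic puzzles"],
        "commonsense": ["cause and effect"],
    },
    "dialogue": {
        "multi-turn": ["context carry-over"],
    },
}

def iter_tasks(tree):
    """Flatten the tree into (area, category, task) triples for evaluation."""
    for area, categories in tree.items():
        for category, tasks in categories.items():
            for task in tasks:
                yield area, category, task

for triple in iter_tasks(task_tree):
    print(triple)
```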

Everyone Deserves A Reward: Learning Customized Human Preferences

1 code implementation · 6 Sep 2023 · Pengyu Cheng, Jiawen Xie, Ke Bai, Yong Dai, Nan Du

In addition, from the perspective of data efficiency, we propose a three-stage customized RM learning scheme and empirically verify its effectiveness on both general preference datasets and our DSP set.

Imitation Learning

Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers

no code implementations · 25 Aug 2023 · Jiawen Xie, Pengyu Cheng, Xiao Liang, Yong Dai, Nan Du

Although dominant in natural language processing, transformer-based models remain challenged by long-sequence processing, because the computational cost of self-attention swells quadratically with the input sequence length.

Reading Comprehension · Text Summarization
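The chunking idea in the title can be sketched as follows: split the long input into overlapping fixed-size chunks, so each self-attention call is over a bounded length, then keep only the best-scoring chunks. The overlap size and the trivial scorer below are placeholders for the paper's learned alignment and selection.

```python
def chunk_text(tokens, chunk_len=512, overlap=64):
    """Split a long token sequence into overlapping fixed-size chunks, so
    each self-attention call costs O(chunk_len^2) instead of O(n^2)."""
    step = chunk_len - overlap
    return [tokens[i:i + chunk_len]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

def select_chunks(chunks, score_fn, k=4):
    """Keep the k highest-scoring chunks; score_fn is a stand-in for the
    paper's learned alignment/selection, here just a placeholder heuristic."""
    return sorted(chunks, key=score_fn, reverse=True)[:k]

tokens = list(range(5000))                     # placeholder token ids
chunks = chunk_text(tokens)
best = select_chunks(chunks, score_fn=len)     # trivial scorer for the demo
print(len(chunks), [len(c) for c in best])
```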

SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills

no code implementations · 28 Jun 2023 · Zhangyin Feng, Yong Dai, Fan Zhang, Duyu Tang, Xiaocheng Feng, Shuangzhi Wu, Bing Qin, Yunbo Cao, Shuming Shi

Traditional multitask learning methods can basically exploit common knowledge only task-wise or language-wise, and thus lose either cross-language or cross-task knowledge.

Natural Language Understanding

When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods

1 code implementation · 20 Dec 2022 · Zhuo Zhang, Yuanhang Yang, Yong Dai, Lizhen Qu, Zenglin Xu

To facilitate research on PETuning in FL, we also develop a federated tuning framework, FedPETuning, which allows practitioners to conveniently exploit different PETuning methods under the FL training paradigm.

Federated Learning
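The premise of federated parameter-efficient tuning can be sketched in a few lines: each client updates only a small set of tunable parameters (e.g., adapter weights) while the frozen backbone stays local, and the server averages just those lightweight parameters, which keeps communication cheap. The parameter names, stub gradient, and plain FedAvg aggregation below are illustrative assumptions, not FedPETuning's API.

```python
import numpy as np

def client_update(tunable, local_grad_fn, lr=0.01, steps=5):
    """Local training that touches only the small PETuning parameters;
    the frozen backbone never leaves the device."""
    for _ in range(steps):
        for name, param in tunable.items():
            param -= lr * local_grad_fn(name, param)
    return tunable

def fed_avg(client_params):
    """Server step: average only the lightweight tunable parameters,
    which is what keeps communication cheap in this kind of setup."""
    keys = client_params[0].keys()
    return {k: np.mean([cp[k] for cp in client_params], axis=0) for k in keys}

# Placeholder adapter weights and a stub gradient for two clients.
init = lambda: {"adapter.down": np.ones((4, 2)), "adapter.up": np.ones((2, 4))}
grad = lambda name, p: 0.1 * p
clients = [client_update(init(), grad) for _ in range(2)]
print(fed_avg(clients)["adapter.down"].shape)
```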

Pretrained Language Encoders are Natural Tagging Frameworks for Aspect Sentiment Triplet Extraction

no code implementations · 20 Aug 2022 · Yanjie Gou, Yinjie Lei, Lingqiao Liu, Yong Dai, Chunxu Shen, Yongqi Tong

Existing works usually formulate span detection as a 1D token tagging problem and model sentiment recognition with a 2D tagging matrix of token pairs.

Aspect Sentiment Triplet Extraction · Inductive Bias
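A toy rendering of the two tagging views mentioned above: BIO tags over tokens for span detection, and a token-pair matrix whose cells carry the sentiment linking an aspect to an opinion. The sentence, label set, and read-out rule are invented for illustration.

```python
import numpy as np

# Toy sentence: "The pizza was great"
tokens = ["The", "pizza", "was", "great"]

# 1D token tagging for span detection (BIO over aspects and opinions).
tags_1d = ["O", "B-ASP", "O", "B-OPN"]

# 2D tagging matrix over token pairs: cell (i, j) holds the sentiment
# linking aspect token i to opinion token j (0 = none, 1 = positive here).
matrix_2d = np.zeros((len(tokens), len(tokens)), dtype=int)
matrix_2d[1, 3] = 1   # ("pizza", "great") -> positive

def extract_triplets(tokens, tags, matrix):
    """Read out (aspect, opinion, sentiment) triplets from the two views."""
    aspects = [i for i, t in enumerate(tags) if t == "B-ASP"]
    opinions = [j for j, t in enumerate(tags) if t == "B-OPN"]
    label = {1: "positive"}
    return [(tokens[i], tokens[j], label[matrix[i, j]])
            for i in aspects for j in opinions if matrix[i, j]]

print(extract_triplets(tokens, tags_1d, matrix_2d))
# [('pizza', 'great', 'positive')]
```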

Effidit: Your AI Writing Assistant

no code implementations · 3 Aug 2022 · Shuming Shi, Enbo Zhao, Duyu Tang, Yan Wang, Piji Li, Wei Bi, Haiyun Jiang, Guoping Huang, Leyang Cui, Xinting Huang, Cong Zhou, Yong Dai, Dongyang Ma

In Effidit, we significantly expand the capabilities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords-to-sentences (K2S), and cloud input methods (cloud IME).

Keywords to Sentences · Retrieval · +3

One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code

no code implementations · 12 May 2022 · Yong Dai, Duyu Tang, Liangxin Liu, Minghuan Tan, Cong Zhou, Jingquan Wang, Zhangyin Feng, Fan Zhang, Xueyu Hu, Shuming Shi

Moreover, our model supports self-supervised pretraining in the same sparsely activated manner, resulting in better-initialized parameters for different modalities.

Image Retrieval · Retrieval
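A minimal sketch of the sparse-activation idea, assuming one expert module per modality plus a shared layer: only the expert matching the input's modality runs, so the rest of the parameters stay idle. Sizes and routing are illustrative, not the paper's design.

```python
import torch
import torch.nn as nn

class SparselyActivatedMultiModal(nn.Module):
    """Sketch: one expert per modality plus a shared layer; for a given
    input, only the matching expert is activated."""

    def __init__(self, d=128,
                 modalities=("text", "sound", "image", "video", "code")):
        super().__init__()
        self.experts = nn.ModuleDict({m: nn.Linear(d, d) for m in modalities})
        self.shared = nn.Linear(d, d)

    def forward(self, x, modality):
        # Sparse activation: route to exactly one modality expert.
        return self.shared(torch.relu(self.experts[modality](x)))

model = SparselyActivatedMultiModal()
out = model(torch.randn(2, 128), modality="text")
print(out.shape)
```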

Pretraining Chinese BERT for Detecting Word Insertion and Deletion Errors

no code implementations · 26 Apr 2022 · Cong Zhou, Yong Dai, Duyu Tang, Enbo Zhao, Zhangyin Feng, Li Kuang, Shuming Shi

We achieve this by introducing a special token [null], the prediction of which stands for the non-existence of a word.

Language Modelling · Masked Language Modeling · +1
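A toy illustration of the [null] mechanism for deletion checking: place a [null] candidate in every gap, then flag gaps where the model predicts a real word instead of [null]. The stub predictor and the example sentence are invented; only the mechanism mirrors the description above.

```python
# Toy illustration of the [null] token idea for insertion/deletion checking.
# The predictor here is a stub; only the mechanism is shown.

NULL = "[null]"

def interleave_with_null(tokens):
    """Place a [null] candidate in every gap; the model then predicts either
    a real word (a word was deleted here) or [null] (nothing is missing)."""
    out = [NULL]
    for tok in tokens:
        out += [tok, NULL]
    return out

def detect_missing(tokens, predict):
    """Report gaps where the model predicts a word instead of [null]."""
    seq = interleave_with_null(tokens)
    return [(i, predict(seq, i)) for i, t in enumerate(seq)
            if t == NULL and predict(seq, i) != NULL]

# Stub predictor: claims a word is missing in the gap between "I" and "to"
# (index 2 in the interleaved sequence).
sent = ["I", "to", "school"]
stub = lambda seq, i: "go" if i == 2 else NULL
print(detect_missing(sent, stub))   # [(2, 'go')]
```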

Unsupervised Sentiment Analysis by Transferring Multi-source Knowledge

no code implementations · 9 May 2021 · Yong Dai, Jian Liu, Jian Zhang, Hongguang Fu, Zenglin Xu

The first mechanism is a selective domain adaptation (SDA) method, which transfers knowledge from the closest source domain.

Domain Adaptation · Sentiment Analysis
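One simple way to realize "transfer from the closest source domain" is to compare domain-level feature statistics; the sketch below picks the source whose mean feature vector is nearest the target's. The distance choice and random features are assumptions standing in for the paper's SDA criterion.

```python
import numpy as np

def closest_source(target_feats, source_feats_by_domain):
    """Pick the source domain whose mean feature vector is nearest to the
    target's -- a simple proxy for 'closest domain' in selective adaptation."""
    t_mean = target_feats.mean(axis=0)
    dists = {name: np.linalg.norm(feats.mean(axis=0) - t_mean)
             for name, feats in source_feats_by_domain.items()}
    return min(dists, key=dists.get), dists

rng = np.random.default_rng(0)
sources = {"books": rng.normal(0.0, 1, (100, 16)),
           "movies": rng.normal(0.5, 1, (100, 16))}
target = rng.normal(0.4, 1, (80, 16))
best, dists = closest_source(target, sources)
print(best)   # likely "movies", the nearer domain
```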

Contextualize Knowledge Bases with Transformer for End-to-end Task-Oriented Dialogue Systems

no code implementations · EMNLP 2021 · Yanjie Gou, Yinjie Lei, Lingqiao Liu, Yong Dai, Chunxu Shen

Incorporating knowledge bases (KB) into end-to-end task-oriented dialogue systems is challenging, since it requires properly representing KB entities, which are associated with both their KB context and the dialogue context.

Response Generation · Task-Oriented Dialogue Systems

Adversarial Training Based Multi-Source Unsupervised Domain Adaptation for Sentiment Analysis

no code implementations · 10 Jun 2020 · Yong Dai, Jian Liu, Xiancong Ren, Zenglin Xu

Existing MS-UDA algorithms either exploit only the shared features, i.e., the domain-invariant information, or rely on some weak assumption in NLP, e.g., the smoothness assumption.

Multi-Source Unsupervised Domain Adaptation · Sentiment Analysis · +2
