Search Results for author: Yutao Zhu

Found 34 papers, 22 papers with code

基于双星型自注意力网络的搜索结果多样化方法(Search Result Diversification Framework Based on Dual Star-shaped Self-Attention Network)

no code implementations • CCL 2021 • Xubo Qin, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen

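“Prior research has shown that the queries users submit to search engines are usually short. Owing to the nature of natural language, short queries are often ambiguous: the same query may refer to different things, or to different aspects of the same thing. To make search results satisfy users' diverse information needs as far as possible, search engines need to rank the returned results for diversity, which gave rise to search result diversification techniques. Existing diversification methods based on global interaction capture the interactions among all candidate documents through a fully connected self-attention network and have achieved good results. However, because such methods consider only the relationships among documents and not whether a document contains useful information relevant to the query, they are relatively inefficient when training data is limited. This paper proposes a search result diversification method based on a dual star-shaped self-attention network, which replaces the fully connected structure with a star topology and embeds query information to efficiently extract the global interaction features of documents that are relevant to the query. Experimental results show that the model has a significant performance advantage over diversification methods based on fully connected self-attention networks.”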


From Matching to Generation: A Survey on Generative Information Retrieval

1 code implementation • 23 Apr 2024 • Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yuyao Zhang, Peitian Zhang, Yutao Zhu, Zhicheng Dou

We will summarize the advancements in GR regarding model training, document identifier, incremental learning, downstream tasks adaptation, multi-modal GR and generative recommendation, as well as progress in reliable response generation in aspects of internal knowledge memorization, external knowledge augmentation, generating response with citations and personal information assistant.

Incremental Learning Information Retrieval +5


An Integrated Data Processing Framework for Pretraining Foundation Models

1 code implementation • 26 Feb 2024 • Yiding Sun, Feng Wang, Yutao Zhu, Wayne Xin Zhao, Jiaxin Mao

The ability of foundation models relies heavily on large-scale, diverse, and high-quality pretraining data.


UFO: a Unified and Flexible Framework for Evaluating Factuality of Large Language Models

1 code implementation • 22 Feb 2024 • Zhaoheng Huang, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen

To address these challenges, we categorize four available fact sources: human-written evidence, reference documents, search engine results, and LLM knowledge, along with five text generation tasks containing six representative datasets.

Hallucination Retrieval +1


Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs

1 code implementation • 19 Feb 2024 • Jiejun Tan, Zhicheng Dou, Yutao Zhu, Peidong Guo, Kun Fang, Ji-Rong Wen

The integration of large language models (LLMs) and search engines represents a significant evolution in knowledge acquisition methodologies.

Question Answering


BIDER: Bridging Knowledge Inconsistency for Efficient Retrieval-Augmented LLMs via Key Supporting Evidence

no code implementations • 19 Feb 2024 • Jiajie Jin, Yutao Zhu, Yujia Zhou, Zhicheng Dou

Retrieval-augmented large language models (LLMs) have demonstrated efficacy in knowledge-intensive tasks such as open-domain QA, addressing inherent challenges in knowledge update and factual inadequacy.

Question Answering Retrieval


INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning

1 code implementation • 12 Jan 2024 • Yutao Zhu, Peitian Zhang, Chenghao Zhang, Yifei Chen, Binyu Xie, Zhicheng Dou, Zheng Liu, Ji-Rong Wen

Despite this, their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language.

document understanding Information Retrieval +2

183


Don't Make Your LLM an Evaluation Benchmark Cheater

no code implementations • 3 Nov 2023 • Kun Zhou, Yutao Zhu, Zhipeng Chen, Wentong Chen, Wayne Xin Zhao, Xu Chen, Yankai Lin, Ji-Rong Wen, Jiawei Han

Large language models (LLMs) have greatly advanced the frontiers of artificial intelligence, attaining remarkable improvement in model capacity.


Large Language Models for Information Retrieval: A Survey

1 code implementation • 14 Aug 2023 • Yutao Zhu, Huaying Yuan, Shuting Wang, Jiongnan Liu, Wenhan Liu, Chenlong Deng, Haonan Chen, Zhicheng Dou, Ji-Rong Wen

This evolution requires a combination of both traditional methods (such as term-based sparse retrieval methods with rapid response) and modern neural architectures (such as language models with powerful language understanding capacity).

Information Retrieval Question Answering +2

313


Learning to Relate to Previous Turns in Conversational Search

1 code implementation • 5 Jun 2023 • Fengran Mo, Jian-Yun Nie, Kaiyu Huang, Kelong Mao, Yutao Zhu, Peng Li, Yang Liu

An effective way to improve retrieval effectiveness is to expand the current query with historical queries.

Conversational Search Multi-Task Learning +1


ConvGQR: Generative Query Reformulation for Conversational Search

1 code implementation • 25 May 2023 • Fengran Mo, Kelong Mao, Yutao Zhu, Yihong Wu, Kaiyu Huang, Jian-Yun Nie

In this paper, we propose ConvGQR, a new framework to reformulate conversational queries based on generative pre-trained language models (PLMs), one for query rewriting and another for generating potential answers.

Conversational Search Retrieval


WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus

1 code implementation • 10 Apr 2023 • Hongjing Qian, Yutao Zhu, Zhicheng Dou, Haoqi Gu, Xinyu Zhang, Zheng Liu, Ruofei Lai, Zhao Cao, Jian-Yun Nie, Ji-Rong Wen

In this paper, we introduce a new NLP task -- generating short factual articles with references for queries by mining supporting evidence from the Web.

Retrieval Text Generation


A Text-guided Protein Design Framework

1 code implementation • 9 Feb 2023 • Shengchao Liu, Yanjing Li, Zhuoxinran Li, Anthony Gitter, Yutao Zhu, Jiarui Lu, Zhao Xu, Weili Nie, Arvind Ramanathan, Chaowei Xiao, Jian Tang, Hongyu Guo, Anima Anandkumar

Current AI-assisted protein design mainly utilizes protein sequential and structural information.

Property Prediction Protein Design

119


An Empirical Study of Uniform-Architecture Knowledge Distillation in Document Ranking

no code implementations • 8 Feb 2023 • Xubo Qin, Xiyuan Liu, Xiongfeng Zheng, Jie Liu, Yutao Zhu

Specifically, when the student models are in cross-encoder architecture, a pairwise loss of hard labels is critical for training student models, whereas the distillation objectives of intermediate Transformer layers may hurt performance.

Document Ranking Knowledge Distillation


MCP: Self-supervised Pre-training for Personalized Chatbots with Multi-level Contrastive Sampling

no code implementations • 17 Oct 2022 • Zhaoheng Huang, Zhicheng Dou, Yutao Zhu, Zhengyi Ma

To tackle these problems, we propose a self-supervised learning framework MCP for capturing better representations from users' dialogue history for personalized chatbots.

Response Generation Self-Supervised Learning


Pre-training for Information Retrieval: Are Hyperlinks Fully Explored?

no code implementations • 14 Sep 2022 • Jiawen Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Zikai Guo, Zhaoye Fei, Ruofei Lai, Yongkang Wu, Zhao Cao, Zhicheng Dou

Hyperlinks, which are commonly used in Web pages, have been leveraged for designing pre-training objectives.

Information Retrieval Question Answering +1


Enhancing User Behavior Sequence Modeling by Generative Tasks for Session Search

1 code implementation • 23 Aug 2022 • Haonan Chen, Zhicheng Dou, Yutao Zhu, Zhao Cao, Xiaohua Cheng, Ji-Rong Wen

To help the encoding of the current user behavior sequence, we propose to use a decoder and the information of future sequences and a supplemental query.

Session Search


From Easy to Hard: A Dual Curriculum Learning Framework for Context-Aware Document Ranking

1 code implementation • 22 Aug 2022 • Yutao Zhu, Jian-Yun Nie, Yixuan Su, Haonan Chen, Xinyu Zhang, Zhicheng Dou

In this work, we propose a curriculum learning framework for context-aware document ranking, in which the ranking model learns matching signals between the search context and the candidate document in an easy-to-hard manner.

Document Ranking


Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding

no code implementations • COLING 2022 • Zhaoye Fei, Yu Tian, Yongkang Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Jiawen Wu, Dejiang Kong, Ruofei Lai, Zhao Cao, Zhicheng Dou, Xipeng Qiu

Our experiments on 13 benchmark datasets across five natural language understanding tasks demonstrate the superiority of our method.

Multi-Task Learning Natural Language Understanding


PReGAN: Answer Oriented Passage Ranking with Weakly Supervised GAN

no code implementations • 5 Jul 2022 • Pan Du, Jian-Yun Nie, Yutao Zhu, Hao Jiang, Lixin Zou, Xiaohui Yan

Beyond topical relevance, passage ranking for open-domain factoid question answering also requires a passage to contain an answer (answerability).

Passage Ranking Question Answering


Less is More: Learning to Refine Dialogue History for Personalized Dialogue Generation

no code implementations • NAACL 2022 • Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-Rong Wen

Existing personalized dialogue systems have tried to extract user profiles from dialogue history to guide personalized response generation.

Dialogue Generation Response Generation


PSSL: Self-supervised Learning for Personalized Search with Contrastive Sampling

1 code implementation • 24 Nov 2021 • Yujia Zhou, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen

Personalized search plays a crucial role in improving user search experience owing to its ability to build user profiles based on historical behaviors.

Self-Supervised Learning Sentence


Contrastive Learning of User Behavior Sequence for Context-Aware Document Ranking

1 code implementation • 24 Aug 2021 • Yutao Zhu, Jian-Yun Nie, Zhicheng Dou, Zhengyi Ma, Xinyu Zhang, Pan Du, Xiaochen Zuo, Hao Jiang

To learn a more robust representation of the user behavior sequence, we propose a method based on contrastive learning, which takes into account the possible variations in user's behavior sequences.

Contrastive Learning Data Augmentation +1


One Chatbot Per Person: Creating Personalized Chatbots based on Implicit User Profiles

1 code implementation • 20 Aug 2021 • Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-Rong Wen

Specifically, leveraging the benefits of Transformer on language understanding, we train a personalized language model to construct a general user profile from the user's historical responses.

Chatbot Language Modelling


Learning Implicit User Profiles for Personalized Retrieval-Based Chatbot

1 code implementation • 18 Aug 2021 • Hongjin Qian, Zhicheng Dou, Yutao Zhu, Yueyuan Ma, Ji-Rong Wen

To learn a user's personalized language style, we elaborately build language models from shallow to deep using the user's historical responses; to model a user's personalized preferences, we explore the conditional relations underneath each post-response pair of the user.

Chatbot Retrieval


Proactive Retrieval-based Chatbots based on Relevant Knowledge and Goals

1 code implementation • 18 Jul 2021 • Yutao Zhu, Jian-Yun Nie, Kun Zhou, Pan Du, Hao Jiang, Zhicheng Dou

The final response is selected according to the predicted knowledge, the goal to achieve, and the context.

Multi-Task Learning Retrieval


Emotion Eliciting Machine: Emotion Eliciting Conversation Generation based on Dual Generator

no code implementations • 18 May 2021 • Hao Jiang, Yutao Zhu, Xinyu Zhang, Zhicheng Dou, Pan Du, Te Pi, Yantao Jia

Then we propose a dual encoder-decoder structure to model the generation of responses on both the positive and negative sides, based on the changes in the user's emotional status during the conversation.


BERT4SO: Neural Sentence Ordering by Fine-tuning BERT

no code implementations • 25 Mar 2021 • Yutao Zhu, Jian-Yun Nie, Kun Zhou, Shengchao Liu, Yabo Ling, Pan Du

Sentence ordering aims to arrange the sentences of a given text in the correct order.

Sentence Sentence Ordering


Neural Sentence Ordering Based on Constraint Graphs

1 code implementation • 27 Jan 2021 • Yutao Zhu, Kun Zhou, Jian-Yun Nie, Shengchao Liu, Zhicheng Dou

Our experiments on five benchmark datasets show that our method outperforms all the existing baselines significantly, achieving a new state-of-the-art performance.

Sentence Sentence Ordering


Content Selection Network for Document-grounded Retrieval-based Chatbots

1 code implementation • 21 Jan 2021 • Yutao Zhu, Jian-Yun Nie, Kun Zhou, Pan Du, Zhicheng Dou

It is thus crucial to select the part of document content relevant to the current conversation context.

Retrieval


Pchatbot: A Large-Scale Dataset for Personalized Chatbot

2 code implementations • 28 Sep 2020 • Hongjin Qian, Xiaohe Li, Hanxun Zhong, Yu Guo, Yueyuan Ma, Yutao Zhu, Zhanliang Liu, Zhicheng Dou, Ji-Rong Wen

This enables the development of personalized dialogue models that directly learn implicit user personality from the user's dialogue history.

Chatbot


S^3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization

2 code implementations • 18 Aug 2020 • Kun Zhou, Hui Wang, Wayne Xin Zhao, Yutao Zhu, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, Ji-Rong Wen

To tackle this problem, we propose the model S^3-Rec, which stands for Self-Supervised learning for Sequential Recommendation, based on the self-attentive neural architecture.

Attribute Self-Supervised Learning +1

229


ScriptWriter: Narrative-Guided Script Generation

1 code implementation • ACL 2020 • Yutao Zhu, Ruihua Song, Zhicheng Dou, Jian-Yun Nie, Jin Zhou

In dialogue systems, it would also be useful to drive dialogues by a dialogue plan.


Improving Multi-Turn Response Selection Models with Complementary Last-Utterance Selection by Instance Weighting

no code implementations • 18 Feb 2020 • Kun Zhou, Wayne Xin Zhao, Yutao Zhu, Ji-Rong Wen, Jingsong Yu

Open-domain retrieval-based dialogue systems require a considerable amount of training data to learn their parameters.

Retrieval

