1 code implementation • 16 Apr 2024 • Xiaoyang Chen, Ben He, Hongyu Lin, Xianpei Han, Tianshu Wang, Boxi Cao, Le Sun, Yingfei Sun
The practice of Retrieval-Augmented Generation (RAG), which integrates Large Language Models (LLMs) with retrieval systems, has become increasingly prevalent.
1 code implementation • 2 Apr 2024 • Ying Zhou, Ben He, Le Sun
While well-trained text detectors have demonstrated promising performance on unseen test data, recent research suggests that these detectors have vulnerabilities when dealing with adversarial attacks such as paraphrasing.
no code implementations • 23 Feb 2024 • Qiaoyu Tang, Jiawei Chen, Bowen Yu, Yaojie Lu, Cheng Fu, Haiyang Yu, Hongyu Lin, Fei Huang, Ben He, Xianpei Han, Le Sun, Yongbin Li
The rise of large language models (LLMs) has transformed the role of information retrieval (IR) systems in the way humans access information.
no code implementations • 22 Feb 2024 • Ning Bian, Xianpei Han, Hongyu Lin, Yaojie Lu, Ben He, Le Sun
Building machines with commonsense has been a longstanding challenge in NLP due to the reporting bias of commonsense rules and the exposure bias of rule-based commonsense reasoning.
1 code implementation • 1 Feb 2024 • Xinlin Peng, Ying Zhou, Ben He, Le Sun, Yingfei Sun
This paper aims to bridge this gap by constructing AIG-ASAP, an AI-generated student essay dataset, employing a range of text perturbation methods that are expected to generate high-quality essays while evading detection.
no code implementations • 22 Nov 2023 • Xinyan Guan, Yanjiang Liu, Hongyu Lin, Yaojie Lu, Ben He, Xianpei Han, Le Sun
Incorporating factual knowledge from knowledge graphs is regarded as a promising approach for mitigating the hallucination of large language models (LLMs).
1 code implementation • 20 Aug 2023 • Xueru Wen, Xiaoyang Chen, Xuanang Chen, Ben He, Le Sun
Dense retrieval has made significant advancements in information retrieval (IR) by achieving high levels of effectiveness while maintaining online efficiency during a single-pass retrieval process.
no code implementations • 31 Jul 2023 • Xuanang Chen, Ben He, Le Sun, Yingfei Sun
Neural ranking models (NRMs) have undergone significant development and have become integral components of information retrieval (IR) systems.
no code implementations • 8 May 2023 • Ning Bian, Hongyu Lin, Peilin Liu, Yaojie Lu, Chunkang Zhang, Ben He, Xianpei Han, Le Sun
LLMs, as AI agents, can observe external information, which shapes their cognition and behaviors.
no code implementations • 3 May 2023 • Xuanang Chen, Ben He, Zheng Ye, Le Sun, Yingfei Sun
Additionally, current methods rely heavily on the use of a well-imitated surrogate NRM to guarantee the attack effect, which makes them difficult to use in practice.
1 code implementation • 3 May 2023 • Xiaoyang Chen, Yanjiang Liu, Ben He, Le Sun, Yingfei Sun
The Differentiable Search Index (DSI) is a novel information retrieval (IR) framework that utilizes a differentiable function to generate a sorted list of document identifiers in response to a given query.
no code implementations • 29 Mar 2023 • Ning Bian, Xianpei Han, Le Sun, Hongyu Lin, Yaojie Lu, Ben He, Shanshan Jiang, Bin Dong
(4) Can ChatGPT effectively leverage commonsense for answering questions?
no code implementations • 9 May 2022 • Ying Zhou, Xuanang Chen, Ben He, Zheng Ye, Le Sun
Knowledge graph completion (KGC) aims to infer missing knowledge triples based on known facts in a knowledge graph.
1 code implementation • 25 Apr 2022 • Xiaoyang Chen, Ben He, Le Sun
While large-scale pre-trained language models like BERT have advanced the state of the art in IR, their application in query performance prediction (QPP) has so far been based on pointwise modeling of individual queries.
no code implementations • 19 Jul 2021 • Ning Bian, Xianpei Han, Bo Chen, Hongyu Lin, Ben He, Le Sun
In this paper, we propose a new framework for unsupervised MRC.
no code implementations • 17 Apr 2021 • Xiaoyang Chen, Kai Hui, Ben He, Xianpei Han, Le Sun, Zheng Ye
BERT-based text ranking models have dramatically advanced the state of the art in ad-hoc retrieval, although most models tend to consider individual query-document pairs independently.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Lingyong Yan, Xianpei Han, Ben He, Le Sun
Bootstrapping for entity set expansion (ESE), which expands new entities using only a few seed entities as supervision, has been studied for a long time.
4 code implementations • 16 Sep 2020 • Xuanang Chen, Ben He, Kai Hui, Le Sun, Yingfei Sun
Despite the effectiveness of utilizing the BERT model for document ranking, the high computational cost of such approaches limits their use.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zhi Zheng, Kai Hui, Ben He, Xianpei Han, Le Sun, Andrew Yates
Query expansion aims to mitigate the mismatch between the language used in a query and in a document.
1 code implementation • 20 Aug 2020 • Canjia Li, Andrew Yates, Sean MacAvaney, Ben He, Yingfei Sun
In this work, we explore strategies for aggregating relevance signals from a document's passages into a final ranking score.
Ranked #2 on Ad-Hoc Information Retrieval on TREC Robust04
no code implementations • IJCNLP 2019 • Lingyong Yan, Xianpei Han, Le Sun, Ben He
Bootstrapping for Entity Set Expansion (ESE) aims at iteratively acquiring new instances of a specific target category.
no code implementations • 20 May 2019 • Yiyu Wang, Jungang Xu, Yingfei Sun, Ben He
Image captioning is a challenging task that is attracting increasing attention in the field of artificial intelligence and can be applied to areas such as efficient image retrieval, intelligent guidance for blind people, and human-computer interaction.
1 code implementation • EMNLP 2018 • Canjia Li, Yingfei Sun, Ben He, Le Wang, Kai Hui, Andrew Yates, Le Sun, Jungang Xu
Pseudo-relevance feedback (PRF) is commonly used to boost the performance of traditional information retrieval (IR) models by using top-ranked documents to identify and weight new query terms, thereby reducing the effect of query-document vocabulary mismatches.
Ranked #9 on Ad-Hoc Information Retrieval on TREC Robust04
no code implementations • ACL 2018 • Cancan Jin, Ben He, Kai Hui, Le Sun
Existing automated essay scoring (AES) models rely on rated essays for the target prompt as training data.