Search Results for author: Honglei Zhuang

Found 28 papers, 5 papers with code

Generate, Filter, and Fuse: Query Expansion via Multi-Step Keyword Generation for Zero-Shot Neural Rankers

no code implementations • 15 Nov 2023 • Minghan Li, Honglei Zhuang, Kai Hui, Zhen Qin, Jimmy Lin, Rolf Jagerman, Xuanhui Wang, Michael Bendersky

We first show that directly applying existing expansion techniques to state-of-the-art neural rankers can deteriorate their zero-shot performance.

Instruction Following, Language Modelling, +1
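
The title points to a three-step pipeline. A minimal sketch of what generate-then-filter-then-fuse expansion might look like, where `call_llm` and `retriever_score` are hypothetical stand-ins, not functions from the paper:

```python
# Hypothetical generate -> filter -> fuse pipeline; `call_llm` and
# `retriever_score` are illustrative stand-ins, not APIs from the paper.

def call_llm(prompt: str) -> list[str]:
    # Placeholder for an LLM call that returns candidate keywords.
    return ["neural ranker", "zero-shot retrieval", "query rewriting"]

def retriever_score(query: str, keyword: str) -> float:
    # Placeholder filter signal, e.g. similarity under a first-stage model.
    return float(len(set(query.split()) & set(keyword.split())))

def expand_query(query: str, threshold: float = 0.0) -> str:
    candidates = call_llm(f"Generate search keywords for: {query}")          # generate
    kept = [k for k in candidates if retriever_score(query, k) > threshold]  # filter
    return " ".join([query] + kept)                                          # fuse

print(expand_query("zero-shot neural ranker query expansion"))
```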

PaRaDe: Passage Ranking using Demonstrations with Large Language Models

no code implementations • 22 Oct 2023 • Andrew Drozdov, Honglei Zhuang, Zhuyun Dai, Zhen Qin, Razieh Rahimi, Xuanhui Wang, Dana Alon, Mohit Iyyer, Andrew McCallum, Donald Metzler, Kai Hui

Recent studies show that large language models (LLMs) can be instructed to effectively perform zero-shot passage re-ranking, in which the results of a first-stage retrieval method, such as BM25, are rated and reordered to improve relevance.

Passage Ranking, Passage Re-Ranking, +6
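
As a rough illustration of this two-stage setup (not the paper's specific demonstration-based prompting), the sketch below rescores first-stage candidates with a hypothetical `llm_relevance` call:

```python
# Sketch of LLM-based re-ranking over first-stage (e.g., BM25) results.
# `llm_relevance` is a hypothetical scoring call, not an API from the paper.

def llm_relevance(query: str, passage: str) -> float:
    # Placeholder: would prompt an LLM (optionally with demonstrations)
    # to rate how relevant `passage` is to `query`.
    return float(len(set(query.lower().split()) & set(passage.lower().split())))

def rerank(query: str, bm25_results: list[str]) -> list[str]:
    # Rate each first-stage candidate, then reorder by the LLM's score.
    scored = [(llm_relevance(query, p), p) for p in bm25_results]
    return [p for _, p in sorted(scored, key=lambda x: x[0], reverse=True)]

print(rerank("passage ranking with LLMs",
             ["LLMs can re-rank passages.", "Cooking pasta at home."]))
```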

Beyond Yes and No: Improving Zero-Shot LLM Rankers via Scoring Fine-Grained Relevance Labels

no code implementations • 21 Oct 2023 • Honglei Zhuang, Zhen Qin, Kai Hui, Junru Wu, Le Yan, Xuanhui Wang, Michael Bendersky

We propose to incorporate fine-grained relevance labels into the prompt for LLM rankers, enabling them to better differentiate among documents with different levels of relevance to the query and thus derive a more accurate ranking.
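
A minimal sketch of the scoring idea, assuming an LLM that returns a log-probability per relevance label (the label set and `label_logprobs` helper are illustrative):

```python
import math

# Sketch: score a document by the expected value of fine-grained
# relevance labels under the LLM's label distribution.
LABELS = {"Not Relevant": 0.0, "Somewhat Relevant": 1.0, "Highly Relevant": 2.0}

def label_logprobs(query: str, doc: str) -> dict[str, float]:
    # Placeholder for an LLM call returning a log-probability per label.
    return {"Not Relevant": -2.0, "Somewhat Relevant": -1.0, "Highly Relevant": -0.5}

def relevance_score(query: str, doc: str) -> float:
    logps = label_logprobs(query, doc)
    # Softmax over the label log-probabilities, then take the expectation.
    z = sum(math.exp(lp) for lp in logps.values())
    return sum(LABELS[l] * math.exp(lp) / z for l, lp in logps.items())

print(relevance_score("q", "d"))
```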

A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models

1 code implementation • 14 Oct 2023 • Shengyao Zhuang, Honglei Zhuang, Bevan Koopman, Guido Zuccon

Our approach reduces the number of LLM inferences and the amount of prompt token consumption during the ranking procedure, significantly improving the efficiency of LLM-based zero-shot ranking.

Document Ranking
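
One way such a setwise oracle could drive ranking is a simple tournament, sketched below; `pick_best` stands in for a single LLM call that sees several documents at once (the details are assumptions, not the paper's exact algorithm):

```python
# Sketch of setwise ranking: each LLM call compares a set of documents
# and returns the most relevant one, so fewer calls are needed than
# with pairwise comparisons. `pick_best` is a hypothetical LLM call.

def pick_best(query: str, docs: list[str]) -> str:
    # Placeholder: would prompt the LLM with all docs at once and ask
    # which is most relevant to the query.
    return max(docs, key=lambda d: len(set(query.split()) & set(d.split())))

def setwise_top_k(query: str, docs: list[str], k: int, set_size: int = 4) -> list[str]:
    remaining, ranked = list(docs), []
    while remaining and len(ranked) < k:
        # Tournament: compare documents set_size at a time.
        winners = [pick_best(query, remaining[i:i + set_size])
                   for i in range(0, len(remaining), set_size)]
        best = pick_best(query, winners) if len(winners) > 1 else winners[0]
        ranked.append(best)
        remaining.remove(best)
    return ranked

print(setwise_top_k("neural ranking", ["a b", "neural ranking doc", "c d"], k=2))
```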

Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting

no code implementations • 30 Jun 2023 • Zhen Qin, Rolf Jagerman, Kai Hui, Honglei Zhuang, Junru Wu, Le Yan, Jiaming Shen, Tianqi Liu, Jialu Liu, Donald Metzler, Xuanhui Wang, Michael Bendersky

Ranking documents using Large Language Models (LLMs) by directly feeding the query and candidate documents into the prompt is an interesting and practical problem.
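
A minimal sketch of pairwise ranking prompting with all-pairs aggregation, where `llm_prefers_first` is a hypothetical stand-in for the LLM comparison call:

```python
from itertools import combinations

# Sketch of pairwise ranking prompting: ask an LLM which of two
# documents is more relevant, then aggregate wins into a ranking.

def llm_prefers_first(query: str, doc_a: str, doc_b: str) -> bool:
    # Placeholder: would prompt the LLM with (query, doc_a, doc_b),
    # ideally in both orders to reduce position bias.
    def overlap(d: str) -> int:
        return len(set(query.split()) & set(d.split()))
    return overlap(doc_a) >= overlap(doc_b)

def pairwise_rank(query: str, docs: list[str]) -> list[str]:
    wins = {d: 0 for d in docs}
    for a, b in combinations(docs, 2):
        winner = a if llm_prefers_first(query, a, b) else b
        wins[winner] += 1
    return sorted(docs, key=lambda d: wins[d], reverse=True)

print(pairwise_rank("text ranking", ["text ranking paper", "cooking", "ranking"]))
```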

How Does Generative Retrieval Scale to Millions of Passages?

no code implementations • 19 May 2023 • Ronak Pradeep, Kai Hui, Jai Gupta, Adam D. Lelkes, Honglei Zhuang, Jimmy Lin, Donald Metzler, Vinh Q. Tran

Popularized by the Differentiable Search Index, the emerging paradigm of generative retrieval re-frames the classic information retrieval problem into a sequence-to-sequence modeling task, forgoing external indices and encoding an entire document corpus within a single Transformer.

Information Retrieval, Passage Ranking, +1
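
A sketch of the interface this implies: the model decodes a document identifier directly from the query, so retrieval is a single generation step. The toy `seq2seq_generate` below merely mimics a trained Transformer:

```python
# Sketch of the generative retrieval interface: a seq2seq model maps a
# query string directly to a document identifier, with no external index.
# The "model" here is a trivial stand-in for a trained Transformer.

TRAINING_EXAMPLES = [
    # (input text, target docid) pairs used to fine-tune the model so that
    # the corpus is effectively memorized in its parameters.
    ("what is generative retrieval", "doc-17"),
    ("scaling to millions of passages", "doc-42"),
]

def seq2seq_generate(query: str) -> str:
    # Placeholder for constrained decoding of a valid docid.
    best = max(TRAINING_EXAMPLES,
               key=lambda ex: len(set(query.split()) & set(ex[0].split())))
    return best[1]

print(seq2seq_generate("generative retrieval at scale"))  # -> a docid
```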

Query Expansion by Prompting Large Language Models

no code implementations • 5 May 2023 • Rolf Jagerman, Honglei Zhuang, Zhen Qin, Xuanhui Wang, Michael Bendersky

Query expansion is a widely used technique to improve the recall of search systems.
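
A minimal sketch of the idea, assuming a hypothetical `call_llm` helper; repeating the original query when fusing is a common trick in this line of work to keep its terms weighted highly:

```python
# Sketch of LLM-based query expansion: prompt a model for related terms
# (or a short pseudo-answer) and append them to the original query before
# running a recall-oriented retriever such as BM25.

def call_llm(prompt: str) -> str:
    # Placeholder for a real LLM call.
    return "lexical matching recall retrieval terms"

def expand(query: str, repeats: int = 3) -> str:
    expansion = call_llm(f"Write terms related to the query: {query}")
    # Repeating the original query keeps its terms weighted highly
    # relative to the (possibly noisy) expansion terms.
    return " ".join([query] * repeats + [expansion])

print(expand("improve search recall"))
```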

Towards Disentangling Relevance and Bias in Unbiased Learning to Rank

no code implementations • 28 Dec 2022 • Yunan Zhang, Le Yan, Zhen Qin, Honglei Zhuang, Jiaming Shen, Xuanhui Wang, Michael Bendersky, Marc Najork

We provide both theoretical analysis and empirical results showing the negative effects of such a correlation on the relevance tower.

Learning-To-Rank
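
The "relevance tower" wording suggests the additive two-tower decomposition common in unbiased learning to rank; a toy sketch under that assumption (architectures and losses here are illustrative):

```python
# Sketch of the additive two-tower setup common in unbiased LTR:
# one tower scores relevance from document features, the other models
# position (examination) bias; their sum predicts the observed click.

def relevance_tower(doc_features: list[float], w: list[float]) -> float:
    return sum(f * wi for f, wi in zip(doc_features, w))  # linear stand-in

def bias_tower(position: int) -> float:
    return 1.0 / (1 + position)  # stand-in examination propensity (logit)

def click_logit(doc_features: list[float], position: int, w: list[float]) -> float:
    # If relevance and bias are correlated in the logs, the relevance
    # tower can absorb bias signal -- the confounding the paper studies.
    return relevance_tower(doc_features, w) + bias_tower(position)

print(click_logit([0.2, 0.8], position=0, w=[0.5, 0.5]))
```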

RankT5: Fine-Tuning T5 for Text Ranking with Ranking Losses

no code implementations • 12 Oct 2022 • Honglei Zhuang, Zhen Qin, Rolf Jagerman, Kai Hui, Ji Ma, Jing Lu, Jianmo Ni, Xuanhui Wang, Michael Bendersky

Recently, substantial progress has been made in text ranking based on pretrained language models such as BERT.
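
One representative ranking loss in this setting is listwise softmax cross-entropy over a candidate list; a minimal sketch (the exact losses used in the paper are not shown here):

```python
import math

# Sketch of a listwise softmax cross-entropy ranking loss, one of the
# ranking losses this line of work fine-tunes with (details assumed).

def softmax_ce_loss(scores: list[float], relevant_idx: int) -> float:
    # Treat scores over the candidate list as logits; the loss is the
    # negative log-probability assigned to the relevant document.
    z = sum(math.exp(s) for s in scores)
    return -math.log(math.exp(scores[relevant_idx]) / z)

# One query with three candidates; the first is the relevant one.
print(softmax_ce_loss([2.1, 0.3, -1.0], relevant_idx=0))
```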

Rank4Class: A Ranking Formulation for Multiclass Classification

no code implementations • 17 Dec 2021 • Nan Wang, Zhen Qin, Le Yan, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork

Multiclass classification (MCC) is a fundamental machine learning problem of classifying each instance into one of a predefined set of classes.

Classification, Image Classification, +4
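
The ranking view can be made concrete: softmax cross-entropy doubles as a listwise loss that pushes the true class to the top of a per-instance class ranking. A small sketch:

```python
import math

# Sketch: multiclass classification viewed as ranking. The model "ranks"
# classes for each instance; standard softmax cross-entropy coincides
# with a listwise ranking loss that pushes the true class to the top.

def class_ranking(logits: list[float], class_names: list[str]) -> list[str]:
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return [class_names[i] for i in order]

def softmax_ce(logits: list[float], true_idx: int) -> float:
    z = sum(math.exp(x) for x in logits)
    return -math.log(math.exp(logits[true_idx]) / z)

logits = [1.2, 3.4, 0.1]
print(class_ranking(logits, ["cat", "dog", "bird"]))  # ranked classes
print(softmax_ce(logits, true_idx=1))                 # loss if "dog" is true
```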

ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning

3 code implementations • ICLR 2022 • Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler

Despite the recent success of multi-task learning and transfer learning for natural language processing (NLP), few works have systematically studied the effect of scaling up the number of tasks during pre-training.

Denoising, Multi-Task Learning

Improving Neural Ranking via Lossless Knowledge Distillation

no code implementations • 30 Sep 2021 • Zhen Qin, Le Yan, Yi Tay, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork

We explore a novel perspective of knowledge distillation (KD) for learning to rank (LTR), and introduce Self-Distilled neural Rankers (SDR), where student rankers are parameterized identically to their teachers.

Knowledge Distillation, Learning-To-Rank
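
A minimal sketch of the distillation step under simple assumptions (an MSE objective on per-query score lists; the paper's actual losses may differ):

```python
# Sketch of self-distillation for ranking: a student with the same
# architecture as the teacher is trained to match the teacher's scores.

def distill_loss(teacher_scores: list[float], student_scores: list[float]) -> float:
    n = len(teacher_scores)
    return sum((t - s) ** 2 for t, s in zip(teacher_scores, student_scores)) / n

teacher = [2.0, 0.5, -1.0]   # teacher ranker's scores for one query's docs
student = [1.8, 0.7, -0.9]   # identically parameterized student's scores
print(distill_loss(teacher, student))
```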

Rank4Class: Examining Multiclass Classification through the Lens of Learning to Rank

no code implementations • 29 Sep 2021 • Nan Wang, Zhen Qin, Le Yan, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork

We further demonstrate that the most popular MCC architecture in deep learning can be equivalently formulated as an LTR pipeline, with a specific set of choices of ranking model architecture and loss function.

Image Classification, Information Retrieval, +4

Distilling Interpretable Models into Human-Readable Code

1 code implementation • 21 Jan 2021 • Walker Ravina, Ethan Sterling, Olexiy Oryeshko, Nathan Bell, Honglei Zhuang, Xuanhui Wang, Yonghui Wu, Alexander Grushetsky

The goal of model distillation is to faithfully transfer teacher model knowledge to a model which is faster, more generalizable, more interpretable, or possesses other desirable characteristics.
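
As a generic illustration of distilling a model into readable rules (not the paper's method), one can fit a shallow tree to a teacher's outputs and print it as if/else logic using scikit-learn:

```python
from sklearn.tree import DecisionTreeRegressor, export_text

# Generic illustration: fit a shallow, interpretable student to the
# teacher's outputs and render it as human-readable rules.

def teacher(x1: float, x2: float) -> float:
    return 2.0 * x1 - x2  # stand-in for a complex teacher model

X = [[a / 10, b / 10] for a in range(10) for b in range(10)]
y = [teacher(x1, x2) for x1, x2 in X]

student = DecisionTreeRegressor(max_depth=2).fit(X, y)
print(export_text(student, feature_names=["x1", "x2"]))
```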

Neural Rankers are hitherto Outperformed by Gradient Boosted Decision Trees

no code implementations • ICLR 2021 • Zhen Qin, Le Yan, Honglei Zhuang, Yi Tay, Rama Kumar Pasumarthi, Xuanhui Wang, Michael Bendersky, Marc Najork

We first validate this concern by showing that most recent neural LTR models are, by a large margin, inferior to the best publicly available Gradient Boosted Decision Trees (GBDT) in terms of their reported ranking accuracy on benchmark datasets.

Learning-To-Rank

What Makes a Star Teacher? A Hierarchical BERT Model for Evaluating Teacher's Performance in Online Education

no code implementations • 3 Dec 2020 • Wen Wang, Honglei Zhuang, Mi Zhou, Hanyu Liu, Beibei Li

Based on these insights, we then propose a hierarchical course BERT model to predict teachers' performance in online education.

Adaptive Double-Exploration Tradeoff for Outlier Detection

no code implementations • 13 May 2020 • Xiaojin Zhang, Honglei Zhuang, Shengyu Zhang, Yuan Zhou

We study a variant of the thresholding bandit problem (TBP) in the context of outlier detection, where the objective is to identify the outliers whose rewards are above a threshold.

Outlier Detection
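
A toy sketch of the thresholding objective with naive uniform sampling; a real algorithm would allocate pulls adaptively and use confidence bounds:

```python
import random

# Sketch of a thresholding bandit: pull arms, maintain mean estimates,
# and flag arms whose estimated reward exceeds the threshold.

def run_tbp(arm_means: list[float], threshold: float, pulls_per_arm: int = 200):
    random.seed(0)
    estimates = []
    for mu in arm_means:
        rewards = [random.gauss(mu, 1.0) for _ in range(pulls_per_arm)]
        estimates.append(sum(rewards) / pulls_per_arm)
    return [i for i, est in enumerate(estimates) if est > threshold]

# Arms 2 and 3 are the "outliers" above the threshold of 1.0.
print(run_tbp([0.1, 0.4, 1.6, 2.0], threshold=1.0))
```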

Separate and Attend in Personal Email Search

no code implementations • 21 Nov 2019 • Yu Meng, Maryam Karimzadehgan, Honglei Zhuang, Donald Metzler

In personal email search, user queries often impose different requirements on different aspects of the retrieved emails.

Learning-To-Rank

Spherical Text Embedding

1 code implementation • NeurIPS 2019 • Yu Meng, Jiaxin Huang, Guangyuan Wang, Chao Zhang, Honglei Zhuang, Lance Kaplan, Jiawei Han

While text embeddings are typically learned in the Euclidean space, directional similarity is often more effective in tasks such as word similarity and document clustering, which creates a gap between the training stage and usage stage of text embedding.

Clustering, Riemannian optimization, +1
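
A minimal sketch of the directional view: keep vectors on the unit sphere and compare by cosine similarity (the Riemannian optimization used for training is omitted):

```python
import math

# Sketch of the spherical view: embeddings live on the unit sphere and
# are compared by cosine (dot product of unit vectors) rather than
# Euclidean distance.

def normalize(v: list[float]) -> list[float]:
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(u: list[float], v: list[float]) -> float:
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))

print(cosine([1.0, 2.0, 0.0], [2.0, 4.0, 0.1]))  # near 1: same direction
```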

Identifying Outlier Arms in Multi-Armed Bandit

no code implementations NeurIPS 2017 Honglei Zhuang, Chi Wang, Yifan Wang

Outlier detection is a powerful method for narrowing attention to a few objects after their data are collected.

Outlier Detection
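
A toy sketch of one way to define outlier arms relative to the rest, e.g. estimates above mean + k·sigma over all arm means (the constant k and the omitted adaptive sampling are assumptions here):

```python
import statistics

# Sketch: outlier arms defined relative to the other arms, e.g. arms
# whose estimated mean exceeds mean + k * std of all arm means.

def outlier_arms(arm_estimates: list[float], k: float = 1.5) -> list[int]:
    mu = statistics.mean(arm_estimates)
    sigma = statistics.pstdev(arm_estimates)
    return [i for i, est in enumerate(arm_estimates) if est > mu + k * sigma]

print(outlier_arms([0.1, 0.2, 0.15, 0.18, 0.9]))  # arm 4 stands out
```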

Identifying Semantically Deviating Outlier Documents

no code implementations • EMNLP 2017 • Honglei Zhuang, Chi Wang, Fangbo Tao, Lance Kaplan, Jiawei Han

A document outlier is a document whose semantics substantially deviate from those of the majority of documents in a corpus.

Outlier Detection

PReP: Path-Based Relevance from a Probabilistic Perspective in Heterogeneous Information Networks

no code implementations • 5 Jun 2017 • Yu Shi, Po-Wei Chan, Honglei Zhuang, Huan Gui, Jiawei Han

From real-world data, we also identify cross-meta-path synergy, a characteristic important for defining path-based HIN relevance that has not been modeled by existing methods, and propose to model it.
