Search Results for author: Sebastian Hofstätter

Found 26 papers, 18 papers with code

Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models

no code implementations • 5 Dec 2023 • Xinyu Zhang, Sebastian Hofstätter, Patrick Lewis, Raphael Tang, Jimmy Lin

However, current works in this direction all depend on the GPT models, making it a single point of failure in scientific reproducibility.

Passage Retrieval Retrieval

Annotating Data for Fine-Tuning a Neural Ranker? Current Active Learning Strategies are not Better than Random Selection

no code implementations • 12 Sep 2023 • Sophia Althammer, Guido Zuccon, Sebastian Hofstätter, Suzan Verberne, Allan Hanbury

We further find that gains provided by AL strategies come at the expense of more assessments (thus higher annotation costs) and AL strategies underperform random selection when comparing effectiveness given a fixed annotation cost.

Active Learning Domain Adaptation

Ranger: A Toolkit for Effect-Size Based Multi-Task Evaluation

1 code implementation • 24 May 2023 • Mete Sertkan, Sophia Althammer, Sebastian Hofstätter

In this paper, we introduce Ranger - a toolkit to facilitate the easy use of effect-size-based meta-analysis for multi-task evaluation in NLP and IR.

FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation

no code implementations • 28 Sep 2022 • Sebastian Hofstätter, Jiecao Chen, Karthik Raman, Hamed Zamani

Retrieval-augmented generation models offer many benefits over standalone language models: besides a textual answer to a given query they provide provenance items retrieved from an updateable knowledge base.

Open-Domain Question Answering Re-Ranking +2

TripJudge: A Relevance Judgement Test Collection for TripClick Health Retrieval

1 code implementation • 14 Aug 2022 • Sophia Althammer, Sebastian Hofstätter, Suzan Verberne, Allan Hanbury

Robust test collections are crucial for Information Retrieval research.

Information Retrieval Retrieval

Multi-Task Retrieval-Augmented Text Generation with Relevance Sampling

no code implementations • 7 Jul 2022 • Sebastian Hofstätter, Jiecao Chen, Karthik Raman, Hamed Zamani

This paper studies multi-task training of retrieval-augmented generation models for knowledge-intensive tasks.

Open-Domain Question Answering Retrieval +1

Are We There Yet? A Decision Framework for Replacing Term Based Retrieval with Dense Retrieval Systems

no code implementations • 26 Jun 2022 • Sebastian Hofstätter, Nick Craswell, Bhaskar Mitra, Hamed Zamani, Allan Hanbury

Recently, several dense retrieval (DR) models have demonstrated performance competitive with the term-based retrieval methods that are ubiquitous in search systems.

Retrieval

Introducing Neural Bag of Whole-Words with ColBERTer: Contextualized Late Interactions using Enhanced Reduction

no code implementations • 24 Mar 2022 • Sebastian Hofstätter, Omar Khattab, Sophia Althammer, Mete Sertkan, Allan Hanbury

Recent progress in neural information retrieval has demonstrated large gains in effectiveness, while often sacrificing the efficiency and interpretability of the neural model compared to classical approaches.

Information Retrieval Retrieval

PARM: A Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval

1 code implementation • 5 Jan 2022 • Sophia Althammer, Sebastian Hofstätter, Mete Sertkan, Suzan Verberne, Allan Hanbury

However, in the web domain we are in a setting with large amounts of training data and a query-to-passage or a query-to-document retrieval task.

Passage Retrieval Retrieval

Establishing Strong Baselines for TripClick Health Retrieval

2 code implementations • 2 Jan 2022 • Sebastian Hofstätter, Sophia Althammer, Mete Sertkan, Allan Hanbury

We present strong Transformer-based re-ranking and dense retrieval baselines for the recently released TripClick health ad-hoc retrieval collection.

Re-Ranking Retrieval

A Time-Optimized Content Creation Workflow for Remote Teaching

1 code implementation • 11 Oct 2021 • Sebastian Hofstätter, Sophia Althammer, Mete Sertkan, Allan Hanbury

We describe our workflow to create an engaging remote learning experience for a university course, while minimizing the post-production time of the educators.

Linguistically Informed Masking for Representation Learning in the Patent Domain

1 code implementation • 10 Jun 2021 • Sophia Althammer, Mark Buckley, Sebastian Hofstätter, Allan Hanbury

Domain-specific contextualized language models have demonstrated substantial effectiveness gains for domain-specific downstream tasks, like similarity matching, entity recognition or information retrieval.

Domain Adaptation Information Retrieval +2

Intra-Document Cascading: Learning to Select Passages for Neural Document Ranking

1 code implementation • 20 May 2021 • Sebastian Hofstätter, Bhaskar Mitra, Hamed Zamani, Nick Craswell, Allan Hanbury

An emerging recipe for achieving state-of-the-art effectiveness in neural document re-ranking involves utilizing large pre-trained language models - e.g., BERT - to evaluate all individual passages in the document and then aggregating the outputs by pooling or additional Transformer layers.

Document Ranking Knowledge Distillation +1

Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling

4 code implementations • 14 Apr 2021 • Sebastian Hofstätter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin, Allan Hanbury

A vital step towards the widespread adoption of neural retrieval models is their resource efficiency throughout the training, indexing and query workflows.

Ranked #15 on Zero-shot Text Search on BEIR

Re-Ranking Retrieval +2

Mitigating the Position Bias of Transformer Models in Passage Re-Ranking

1 code implementation • 18 Jan 2021 • Sebastian Hofstätter, Aldo Lipani, Sophia Althammer, Markus Zlabinger, Allan Hanbury

In this work we analyze position bias on datasets, the contextualized representations, and their effect on retrieval results.

Passage Re-Ranking Position +4

Cross-domain Retrieval in the Legal and Patent Domains: a Reproducibility Study

1 code implementation • 21 Dec 2020 • Sophia Althammer, Sebastian Hofstätter, Allan Hanbury

For reproducibility and transparency as well as to benefit the community we make our source code and the trained models publicly available.

Information Retrieval Language Modelling +1

Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation

1 code implementation • 6 Oct 2020 • Sebastian Hofstätter, Sophia Althammer, Michael Schröder, Mete Sertkan, Allan Hanbury

Based on this finding, we propose a cross-architecture training procedure with a margin-focused loss (Margin-MSE) that adapts knowledge distillation to the varying score output distributions of different BERT and non-BERT passage ranking architectures.

Knowledge Distillation Passage Ranking +3

Fine-Grained Relevance Annotations for Multi-Task Document Ranking and Question Answering

1 code implementation • 12 Aug 2020 • Sebastian Hofstätter, Markus Zlabinger, Mete Sertkan, Michael Schröder, Allan Hanbury

We extend the ranked retrieval annotations of the Deep Learning track of TREC 2019 with passage and word level graded relevance annotations for all relevant documents.

Document Ranking Question Answering +1

DEXA: Supporting Non-Expert Annotators with Dynamic Examples from Experts

1 code implementation • 17 May 2020 • Markus Zlabinger, Marta Sabou, Sebastian Hofstätter, Mete Sertkan, Allan Hanbury

of 0.68 to experts in DEXA vs. 0.40 in CONTROL); (ii) already three per majority voting aggregated annotations of the DEXA approach reach substantial agreements to experts of 0.78/0.75/0.69 for P/I/O (in CONTROL 0.73/0.58/0.46).

Avg Sentence +1

Local Self-Attention over Long Text for Efficient Document Retrieval

1 code implementation • 11 May 2020 • Sebastian Hofstätter, Hamed Zamani, Bhaskar Mitra, Nick Craswell, Allan Hanbury

In this work, we propose a local self-attention which considers a moving window over the document terms and for each term attends only to other terms in the same window.

Document Ranking Retrieval

Interpretable & Time-Budget-Constrained Contextualization for Re-Ranking

1 code implementation • 4 Feb 2020 • Sebastian Hofstätter, Markus Zlabinger, Allan Hanbury

In addition, to gain insight into TK, we perform a clustered query analysis of TK's results, highlighting its strengths and weaknesses on queries with different types of information need, and we show how to interpret the cause of ranking differences between two documents by comparing their internal scores.

Re-Ranking Word Embeddings

DSR: A Collection for the Evaluation of Graded Disease-Symptom Relations

no code implementations • 15 Jan 2020 • Markus Zlabinger, Sebastian Hofstätter, Navid Rekabsaz, Allan Hanbury

While existing disease-symptom relationship extraction methods are used as the foundation of various medical tasks, no collection is available to systematically evaluate the performance of such methods.

Medical Diagnosis Word Embeddings

Neural-IR-Explorer: A Content-Focused Tool to Explore Neural Re-Ranking Results

1 code implementation • 10 Dec 2019 • Sebastian Hofstätter, Markus Zlabinger, Allan Hanbury

In this paper we look beyond metrics-based evaluation of Information Retrieval systems, to explore the reasons behind ranking results.

Information Retrieval Re-Ranking +1

TU Wien @ TREC Deep Learning '19 -- Simple Contextualization for Re-ranking

1 code implementation • 3 Dec 2019 • Sebastian Hofstätter, Markus Zlabinger, Allan Hanbury

The usage of neural network models puts multiple objectives in conflict with each other: Ideally we would like to create a neural model that is effective, efficient, and interpretable at the same time.

Document Ranking Passage Ranking +2

Let's measure run time! Extending the IR replicability infrastructure to include performance aspects

no code implementations • 10 Jul 2019 • Sebastian Hofstätter, Allan Hanbury

Establishing a docker-based replicability infrastructure offers the community a great opportunity: measuring the run time of information retrieval systems.

Information Retrieval Re-Ranking +1

On the Effect of Low-Frequency Terms on Neural-IR Models

1 code implementation • 29 Apr 2019 • Sebastian Hofstätter, Navid Rekabsaz, Carsten Eickhoff, Allan Hanbury

Low-frequency terms are a recurring challenge for information retrieval models; neural IR frameworks in particular struggle to adequately capture infrequently observed words.

Passage Retrieval Retrieval +1
