Zero-shot Text Search

13 papers with code • 18 benchmarks • 15 datasets

Zero-shot text search evaluates retrieval models on domains and datasets not seen during training, i.e., without in-domain relevance labels or fine-tuning.


Most implemented papers

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval

microsoft/ANCE ICLR 2021

In this paper, we identify that the main bottleneck is in the training mechanisms, where the negative instances used in training are not representative of the irrelevant documents in testing.
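
The paper's remedy is to mine negatives from a periodically refreshed approximate-nearest-neighbor index built with the retriever being trained. A minimal sketch of that loop, with a toy linear layer standing in for the BERT encoder and brute-force search standing in for the ANN index (all names and sizes here are illustrative):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
encoder = torch.nn.Linear(32, 16)      # toy stand-in for a BERT dual encoder
queries = torch.randn(4, 32)           # 4 training queries (toy features)
corpus = torch.randn(100, 32)          # 100 candidate passages
positives = [3, 17, 42, 99]            # gold passage id for each query

def mine_hard_negatives(k=8):
    # Retrieve each query's nearest passages with the *current* encoder and
    # keep the non-relevant ones as negatives. ANCE refreshes this index
    # asynchronously during training; brute-force search stands in for ANN.
    with torch.no_grad():
        q = F.normalize(encoder(queries), dim=-1)
        p = F.normalize(encoder(corpus), dim=-1)
        top = (q @ p.T).topk(k, dim=-1).indices
    return [[i for i in row.tolist() if i != pos]
            for row, pos in zip(top, positives)]

# One contrastive step: rank the positive above the mined hard negatives.
for qi, neg_ids in enumerate(mine_hard_negatives()):
    cands = torch.cat([corpus[positives[qi]].unsqueeze(0), corpus[neg_ids]])
    scores = encoder(queries[qi]) @ encoder(cands).T
    loss = F.cross_entropy(scores.unsqueeze(0), torch.tensor([0]))
    loss.backward()
```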

GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval

ukplab/gpl NAACL 2022

Dense retrieval approaches require substantial training data, limiting their usage to the few domains where large training datasets exist.
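
The method itself generates synthetic queries for unlabeled target-domain passages, scores (query, passage) pairs with a cross-encoder, and trains the dense retriever to reproduce the teacher's score margins (MarginMSE). A minimal sketch of that loss, with toy scalars standing in for real model scores:

```python
import torch
import torch.nn.functional as F

# MarginMSE: a cross-encoder "teacher" scores (generated query, passage)
# pairs, and the dense "student" learns to reproduce the teacher's score
# margin between a positive and a mined negative passage.
def margin_mse(student_pos, student_neg, teacher_pos, teacher_neg):
    return F.mse_loss(student_pos - student_neg, teacher_pos - teacher_neg)

# Toy scores standing in for real model outputs.
loss = margin_mse(torch.tensor(1.2), torch.tensor(0.3),
                  torch.tensor(4.0), torch.tensor(1.5))
```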

Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling

sebastian-hofstaetter/tas-balanced-dense-retrieval 14 Apr 2021

A vital step towards the widespread adoption of neural retrieval models is their resource efficiency throughout the training, indexing and query workflows.
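
The sampling strategy named in the title clusters the training queries and composes each batch from a single cluster, so in-batch negatives share a topic and are harder. A rough sketch of that batch composition, using scikit-learn's KMeans and illustrative sizes (the full method additionally balances pairwise margins within the batch):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
query_embs = rng.normal(size=(1000, 64))  # stand-in for baseline query embeddings

# Cluster the training queries once before training; each batch then draws
# all of its queries from a single cluster (topic).
clusters = KMeans(n_clusters=50, n_init=10, random_state=0).fit_predict(query_embs)

def topic_aware_batch(batch_size=32):
    c = rng.integers(50)                      # pick one topic cluster
    members = np.flatnonzero(clusters == c)
    return rng.choice(members, size=min(batch_size, len(members)), replace=False)

print(topic_aware_batch())
```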

ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction

stanford-futuredata/ColBERT NAACL 2022

Neural information retrieval (IR) has greatly advanced search and other knowledge-intensive language tasks.
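
The late interaction named in the title is ColBERT's MaxSim operator: queries and documents are encoded token by token, and relevance is scored by matching each query token against its best document token. A minimal sketch with illustrative dimensions:

```python
import torch
import torch.nn.functional as F

def maxsim(q_embs, d_embs):
    # Late interaction: every query token matches its most similar document
    # token, and the per-token maxima are summed into the relevance score.
    # q_embs: (q_tokens, dim), d_embs: (d_tokens, dim), both L2-normalized.
    return (q_embs @ d_embs.T).max(dim=-1).values.sum()

q = F.normalize(torch.randn(8, 128), dim=-1)     # query token embeddings
d = F.normalize(torch.randn(200, 128), dim=-1)   # document token embeddings
print(maxsim(q, d))
```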

Large Dual Encoders Are Generalizable Retrievers

google-research/t5x_retrieval 15 Dec 2021

With multi-stage training, surprisingly, scaling up the model size brings significant improvement on a variety of retrieval tasks, especially for out-of-domain generalization.
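
One design detail worth noting: the dual encoder keeps a fixed-size embedding bottleneck while the backbone grows, so scaling the model leaves the index footprint and query-time cost unchanged. A toy sketch of that shape contract, with a linear layer standing in for T5 encoders of increasing size:

```python
import torch

# The bottleneck embedding size stays fixed while the backbone grows, so a
# larger model changes nothing about index size or query-time cost.
class DualEncoder(torch.nn.Module):
    def __init__(self, backbone_dim, emb_dim=768):
        super().__init__()
        self.backbone = torch.nn.Linear(128, backbone_dim)  # toy "T5 encoder"
        self.project = torch.nn.Linear(backbone_dim, emb_dim)

    def forward(self, x):
        return self.project(torch.relu(self.backbone(x)))

small, large = DualEncoder(512), DualEncoder(4096)
x = torch.randn(1, 128)
assert small(x).shape == large(x).shape == (1, 768)  # same index footprint
```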

Zero-Shot Question Generation from Knowledge Graphs for Unseen Predicates and Entity Types

NAACL2018Anonymous/submission NAACL 2018

We present a neural model for question generation from knowledge base triples in a "Zero-Shot" setup, that is, generating questions for triples containing predicates, subject types, or object types that were not seen at training time.

MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers

microsoft/unilm NeurIPS 2020

The small model (student) is trained by deeply mimicking the self-attention module of the large model (teacher), a component that plays a vital role in Transformer networks.
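
A minimal sketch of the attention-mimicking objective, assuming the last-layer attention maps of teacher and student have already been extracted and share a head count (the epsilon clamp and shapes are illustrative choices, not taken from the released code):

```python
import torch
import torch.nn.functional as F

def attention_distill_loss(student_attn, teacher_attn):
    # KL divergence between the teacher's and the student's last-layer
    # self-attention distributions; the full method also matches the
    # scaled dot-products between values ("value relations").
    # Shapes: (heads, seq_len, seq_len); each row is a softmax distribution.
    return F.kl_div(student_attn.clamp_min(1e-12).log(), teacher_attn,
                    reduction="batchmean")

teacher = torch.softmax(torch.randn(12, 32, 32), dim=-1)
student = torch.softmax(torch.randn(12, 32, 32), dim=-1)
print(attention_distill_loss(student, teacher))
```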

SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval

naver/splade 21 Sep 2021

Meanwhile, there has been growing interest in learning sparse representations for documents and queries that inherit desirable properties of bag-of-words models, such as exact term matching and the efficiency of inverted indexes.
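
A minimal sketch of SPLADE-style term weighting, assuming a BERT MLM head has already produced per-token vocabulary logits (sizes are illustrative; v2 max-pools over tokens where the original summed):

```python
import torch

def splade_doc_vector(mlm_logits):
    # ReLU + log-saturation over the MLM logits, max-pooled across input
    # tokens (the v2 pooling), giving one weight per vocabulary term. In a
    # trained model, sparsity regularization drives most weights to zero,
    # so the vector drops straight into an inverted index.
    return torch.log1p(torch.relu(mlm_logits)).max(dim=0).values

logits = torch.randn(20, 30522)       # (seq_len, vocab) from a BERT MLM head
doc_vec = splade_doc_vector(logits)
print((doc_vec > 0).float().mean())   # fraction of vocabulary terms activated
```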

Text and Code Embeddings by Contrastive Pre-Training

openmatch/coco-dr 24 Jan 2022

As with text embeddings, we train code embedding models on (text, code) pairs, obtaining a 20.8% relative improvement over the prior best work on code search.
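
The training signal is a symmetric in-batch contrastive loss over matched pairs: each row's partner is its positive, and every other row in the batch is a negative. A minimal sketch with random tensors in place of encoder outputs (the temperature is an illustrative value):

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(text_embs, code_embs, temperature=0.05):
    # Row i of each matrix is a matched (text, code) pair; all other rows
    # serve as in-batch negatives. The loss is symmetrized over both
    # retrieval directions (text -> code and code -> text).
    t = F.normalize(text_embs, dim=-1)
    c = F.normalize(code_embs, dim=-1)
    logits = t @ c.T / temperature
    labels = torch.arange(len(logits))
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.T, labels)) / 2

loss = in_batch_contrastive_loss(torch.randn(16, 64), torch.randn(16, 64))
```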

SGPT: GPT Sentence Embeddings for Semantic Search

muennighoff/sgpt 17 Feb 2022

To this end, we propose SGPT to use decoders for sentence embeddings and semantic search via prompting or fine-tuning.
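
For the bi-encoder setting, SGPT pools decoder hidden states with position-weighted mean pooling: under causal attention, later tokens have seen more of the input, so they receive larger weights. A minimal sketch with an illustrative hidden size:

```python
import torch

def position_weighted_mean(hidden_states):
    # Average the decoder's hidden states with weights that grow linearly
    # with position, since causal attention lets later tokens attend to
    # more context. hidden_states: (seq_len, dim).
    n = hidden_states.size(0)
    w = torch.arange(1, n + 1, dtype=hidden_states.dtype)
    return (hidden_states * w.unsqueeze(-1)).sum(0) / w.sum()

emb = position_weighted_mean(torch.randn(10, 768))
```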