Knowledge Probing
21 papers with code • 6 benchmarks • 3 datasets
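Knowledge probing typically fills a cloze-style prompt (e.g. "The capital of France is [MASK].") and checks whether a pretrained model ranks the correct entity highest. The sketch below illustrates that protocol only; the `score` function is a hypothetical stand-in for a real LM (with an actual masked LM you would rank the model's vocabulary probabilities at the [MASK] position instead).

```python
# Minimal sketch of cloze-style knowledge probing.
# NOTE: score() is a hypothetical stand-in, not a real model — a toy
# hand-built table replaces the LM's predicted [MASK] probabilities.

def score(prompt: str, candidate: str) -> float:
    """Return a pseudo-probability for the candidate filling [MASK]."""
    toy_lm = {  # illustration only; a real probe queries model logits
        ("The capital of France is [MASK].", "Paris"): 0.92,
        ("The capital of France is [MASK].", "Lyon"): 0.04,
        ("The capital of France is [MASK].", "Berlin"): 0.01,
    }
    return toy_lm.get((prompt, candidate), 0.0)

def probe(prompt: str, candidates: list[str]) -> str:
    """Rank candidate fillers for the [MASK] slot; return the top one."""
    return max(candidates, key=lambda c: score(prompt, c))

top = probe("The capital of France is [MASK].", ["Paris", "Lyon", "Berlin"])
print(top)  # prints "Paris"
```

Benchmarks such as LAMA (and the legal variant LegalLAMA listed below) evaluate exactly this kind of ranking over relational facts, treating the model's parameters as an implicit knowledge base.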
Most implemented papers
Calibrating Factual Knowledge in Pretrained Language Models
However, we find that facts stored in the PLMs are not always correct.
COPEN: Probing Conceptual Knowledge in Pre-trained Language Models
We believe this is a critical bottleneck for realizing human-like cognition in PLMs.
Galactica: A Large Language Model for Science
We believe these results demonstrate the potential for language models as a new interface for science.
Injecting Domain Knowledge in Language Models for Task-Oriented Dialogue Systems
Pre-trained language models (PLMs) have advanced the state of the art across NLP applications, but lack domain-specific knowledge that does not naturally occur in pre-training data.
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Despite their impressive performance on diverse tasks, large language models (LLMs) still struggle with tasks requiring rich world knowledge, suggesting the limits of relying solely on their parameters to encode world knowledge.
Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding
We show that SOTA multimodally trained text encoders outperform unimodally trained text encoders on the VLU tasks, while the reverse holds on the NLU tasks, lending new context to previously mixed results regarding the NLU capabilities of multimodal models.
LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development
To this end, we release a multinational English legal corpus (LeXFiles) and a legal knowledge probing benchmark (LegalLAMA) to facilitate training and detailed analysis of legal-oriented PLMs.
Using Large Language Models for Knowledge Engineering (LLMKE): A Case Study on Wikidata
In this work, we explore the use of Large Language Models (LLMs) for knowledge engineering tasks in the context of the ISWC 2023 LM-KBC Challenge.
Assessing the Reliability of Large Language Model Knowledge
Large language models (LLMs) have been treated as knowledge bases due to their strong performance in knowledge probing tasks.
PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain
Biomedical language understanding benchmarks are the driving forces for artificial intelligence applications with large language model (LLM) back-ends.