Only Connect Walls Dataset Task 1 (Grouping)

10 papers with code • 1 benchmark • 1 dataset

Split data into groups while taking into account prior knowledge expressed as constraints on points, groups of points, or clusters.
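As a rough illustration of this constrained-grouping setting (not taken from any paper listed below; the toy data, must-link pairs, and cluster count are all made up), the sketch below merges must-link points into one representative before plain k-means and copies the resulting label back to each member:

```python
# Minimal sketch: grouping with must-link constraints.
# Points forced into the same group are merged into a single representative
# before standard k-means; the representative's label is then propagated back.
import numpy as np
from sklearn.cluster import KMeans

X = np.random.RandomState(0).rand(12, 2)      # 12 toy points in 2-D (placeholder data)
must_link = [(0, 1), (2, 3)]                  # pairs that must share a group (illustrative)

# Union-find over must-link pairs to build merged "super-points".
parent = list(range(len(X)))
def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i
for a, b in must_link:
    parent[find(a)] = find(b)

roots = sorted({find(i) for i in range(len(X))})
groups = [[i for i in range(len(X)) if find(i) == r] for r in roots]
reps = np.array([X[g].mean(axis=0) for g in groups])   # one centroid per merged set

labels_reps = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(reps)
labels = np.empty(len(X), dtype=int)
for g, lab in zip(groups, labels_reps):
    labels[g] = lab                                     # propagate back to original points
print(labels)
```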

Datasets

OCW (Only Connect Wall)

Most implemented papers

RoBERTa: A Robustly Optimized BERT Pretraining Approach

pytorch/fairseq 26 Jul 2019

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

Deep contextualized word representations

flairNLP/flair NAACL 2018

We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

huggingface/transformers NeurIPS 2019

As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models in on-the-edge and/or under constrained computational training or inference budgets remains challenging.

GPT-4 Technical Report

openai/evals Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.

MPNet: Masked and Permuted Pre-training for Language Understanding

microsoft/MPNet NeurIPS 2020

Since BERT neglects dependency among predicted tokens, XLNet introduces permuted language modeling (PLM) for pre-training to address this problem.

Learning Word Vectors for 157 Languages

dzieciou/lemmatizer-pl LREC 2018

Distributed word representations, or word vectors, have recently been applied to many tasks in natural language processing, leading to state-of-the-art performance.
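As a small usage sketch (the model file name and example words are assumptions; download the vectors for your language from fasttext.cc first), these word vectors can score how likely two clue words are to belong together:

```python
# Sketch: cosine similarity between clue words using pre-trained fastText vectors.
import numpy as np
import fasttext

ft = fasttext.load_model("cc.en.300.bin")      # assumed local path to the English vectors

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v1, v2 = ft.get_word_vector("bishop"), ft.get_word_vector("knight")
print(cos(v1, v2))                              # higher score => more likely the same group
```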

Pre-Training of Deep Bidirectional Protein Sequence Representations with Structural Information

mswzeus/PLUS 25 Nov 2019

Bridging the exponentially growing gap between the numbers of unlabeled and labeled protein sequences, several studies adopted semi-supervised learning for protein sequence modeling.

Text Embeddings by Weakly-Supervised Contrastive Pre-training

microsoft/unilm 7 Dec 2022

This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a wide range of tasks.
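A minimal usage sketch, assuming the publicly released intfloat/e5-base-v2 checkpoint with its recommended "query: " prefix and mean pooling; the clue words are illustrative:

```python
# Hedged sketch: embedding clue words with an E5 checkpoint and comparing them.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("intfloat/e5-base-v2")
model = AutoModel.from_pretrained("intfloat/e5-base-v2")

texts = ["query: mercury", "query: mars", "query: venus", "query: bowie"]
batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    out = model(**batch).last_hidden_state              # (batch, seq, hidden)

mask = batch["attention_mask"].unsqueeze(-1)             # mean-pool over real tokens only
emb = (out * mask).sum(1) / mask.sum(1)
emb = F.normalize(emb, dim=-1)
print(emb @ emb.T)                                       # pairwise cosine similarities
```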

Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset

taatiteam/ocw NeurIPS 2023

In this paper we present the novel Only Connect Wall (OCW) dataset and report results from our evaluation of selected pre-trained language models and LLMs on creative problem solving tasks like grouping clue words by heterogeneous connections, and identifying correct open knowledge domain connections in respective groups.
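For context, each wall asks for 16 clue words to be split into exactly four groups of four. The sketch below is a generic capacity-constrained baseline under that assumption, not the paper's evaluation protocol; random vectors stand in for real clue-word embeddings (e.g., from the fastText or E5 sketches above):

```python
# Generic baseline sketch for OCW Task 1: given one embedding per clue word,
# force exactly 4 groups of 4 via greedy capacity-constrained assignment
# to k-means centroids. Random embeddings are placeholders for real ones.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
emb = rng.normal(size=(16, 300))                  # 16 clue-word embeddings (placeholder)

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(emb)
dist = km.transform(emb)                          # (16, 4) distances to the 4 centroids

groups = {c: [] for c in range(4)}
order = np.dstack(np.unravel_index(np.argsort(dist, axis=None), dist.shape))[0]
assigned = set()
for i, c in order:                                # closest (word, centroid) pairs first
    if i not in assigned and len(groups[int(c)]) < 4:
        groups[int(c)].append(int(i))
        assigned.add(int(i))
print(groups)                                     # four groups of four clue indices
```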