Word Embeddings

1108 papers with code • 0 benchmarks • 52 datasets

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers.

Techniques for learning word embeddings include Word2Vec and GloVe, as well as neural network approaches trained on an NLP task such as language modeling or document classification.
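
As a minimal illustration, the sketch below trains skip-gram Word2Vec embeddings with the gensim library on a toy tokenized corpus; the corpus, hyperparameters, and query words are illustrative assumptions, not taken from any of the papers listed here.

```python
from gensim.models import Word2Vec

# Toy tokenized corpus (illustrative; real training uses millions of sentences).
corpus = [
    ["word", "embeddings", "map", "words", "to", "vectors"],
    ["king", "queen", "man", "woman"],
    ["vectors", "capture", "semantic", "similarity"],
]

# Train skip-gram Word2Vec: 50-dimensional vectors, context window of 2.
model = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1, sg=1)

vec = model.wv["vectors"]              # a 50-dimensional real-valued vector
print(model.wv.most_similar("word"))   # nearest neighbours in embedding space
```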

(Image credit: Dynamic Word Embedding for Evolving Semantic Discovery)

Latest papers with no code

Improving Acoustic Word Embeddings through Correspondence Training of Self-supervised Speech Representations

no code yet • 13 Mar 2024

The HuBERT-based CAE model achieves the best results for word discrimination in all languages, despite HuBERT being pre-trained on English only.

Identifying and interpreting non-aligned human conceptual representations using language modeling

no code yet • 10 Mar 2024

In applying this method, we show that congenital blindness induces conceptual reorganization in both amodal and sensory-related verbal domains, and we identify the associated semantic shifts.

VNLP: Turkish NLP Package

no code yet • 2 Mar 2024

In this work, we present VNLP: the first dedicated, complete, open-source, well-documented, lightweight, production-ready, state-of-the-art Natural Language Processing (NLP) package for the Turkish language.

Learning Intrinsic Dimension via Information Bottleneck for Explainable Aspect-based Sentiment Analysis

no code yet • 28 Feb 2024

To address this, we propose the Information Bottleneck-based Gradient (IBG) explanation framework for ABSA.

Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic Lexical Resources

no code yet • LREC 2018

Supervised models for Word Sense Disambiguation (WSD) currently yield state-of-the-art results on the most popular benchmarks.

Ontology Enhanced Claim Detection

no code yet • 19 Feb 2024

We fused ontology embeddings from a knowledge base with BERT sentence embeddings to perform claim detection for the ClaimBuster and the NewsClaims datasets.
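
As a hedged sketch of one way such a fusion could look, the snippet below concatenates pre-computed sentence and ontology embeddings and trains a linear claim detector on the joint features; the dimensions, random stand-in features, and classifier are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical pre-computed features for n sentences (stand-ins for real encoders):
# 768-dim BERT sentence embeddings and 200-dim ontology/knowledge-base embeddings.
n = 100
bert_embs = rng.normal(size=(n, 768))
onto_embs = rng.normal(size=(n, 200))
labels = rng.integers(0, 2, size=n)    # 1 = claim, 0 = non-claim

# Fuse by concatenation, then fit a linear classifier on the joint features.
fused = np.concatenate([bert_embs, onto_embs], axis=1)   # shape (n, 968)
clf = LogisticRegression(max_iter=1000).fit(fused, labels)
```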

From Prejudice to Parity: A New Approach to Debiasing Large Language Model Word Embeddings

no code yet • 18 Feb 2024

Embeddings play a pivotal role in the efficacy of Large Language Models.

Word Embeddings Revisited: Do LLMs Offer Something New?

no code yet • 16 Feb 2024

Learning meaningful word embeddings is key to training a robust language model.

Injecting Wiktionary to improve token-level contextual representations using contrastive learning

no code yet • 12 Feb 2024

We also propose two new WiC test sets for which we show that our fine-tuning method achieves substantial improvements.

Empowering machine learning models with contextual knowledge for enhancing the detection of eating disorders in social media posts

no code yet • 8 Feb 2024

We tested our approach on a dataset of 2,000 tweets about eating disorders, finding that merging word embeddings with knowledge graph information enhances the predictive models' reliability.
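
A minimal sketch of what "merging word embeddings with knowledge graph information" could mean in practice: mean-pool a tweet's word vectors and append knowledge-graph-derived indicator features before classification. The feature set, toy vectors, and concept list below are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

def tweet_features(tokens, word_vecs, kg_concepts, dim=100):
    """Mean-pool word embeddings and append knowledge-graph indicator features."""
    vecs = [word_vecs[t] for t in tokens if t in word_vecs]
    pooled = np.mean(vecs, axis=0) if vecs else np.zeros(dim)
    # Hypothetical KG features: does the tweet mention any concept linked to
    # "eating disorder" in the knowledge graph, and how many such concepts?
    hits = [t for t in tokens if t in kg_concepts]
    kg_feats = np.array([float(bool(hits)), float(len(hits))])
    return np.concatenate([pooled, kg_feats])

# Toy usage: 100-dim embeddings plus a tiny set of KG-linked concepts.
word_vecs = {"fasting": np.ones(100), "today": np.zeros(100)}
kg_concepts = {"fasting", "purging"}
x = tweet_features(["fasting", "today"], word_vecs, kg_concepts)  # shape (102,)
```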