Word Embeddings

1105 papers with code • 0 benchmarks • 52 datasets

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers.

Techniques for learning word embeddings include Word2Vec, GloVe, and other neural network-based approaches that train on an NLP task such as language modeling or document classification.
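
A minimal sketch of the idea, using the gensim library (an assumed dependency, not tied to any paper listed here); the toy corpus and hyperparameters are purely illustrative:

```python
# Train toy Word2Vec embeddings and query them (gensim assumed installed).
from gensim.models import Word2Vec

corpus = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "cat", "sat", "on", "the", "mat"],
]

# vector_size is the embedding dimensionality; window is the context size.
model = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1, epochs=50)

vec = model.wv["king"]                       # a 50-dimensional real-valued vector
print(model.wv.similarity("king", "queen"))  # cosine similarity of two embeddings
```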

(Image credit: Dynamic Word Embedding for Evolving Semantic Discovery)

Bridging Vision and Language Spaces with Assignment Prediction

park-jungin/vlap 15 Apr 2024

This paper introduces VLAP, a novel approach that bridges pretrained vision models and large language models (LLMs) to make frozen LLMs understand the visual world.
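
The paper defines the actual objective; purely as a hedged illustration of the underlying idea, the sketch below soft-assigns frozen vision features to a frozen LLM's word-embedding table. All names, shapes, and the linear projection are assumptions, not VLAP's code:

```python
# Hypothetical sketch: soft-assign vision features to an LLM's word embeddings.
import torch
import torch.nn.functional as F

vocab_size, llm_dim, vis_dim = 32000, 4096, 768
word_embeddings = torch.randn(vocab_size, llm_dim)  # frozen LLM embedding table
visual_features = torch.randn(16, vis_dim)          # frozen vision-model outputs

proj = torch.nn.Linear(vis_dim, llm_dim)            # the only trainable piece here

# Similarity of each projected visual token to every word embedding.
logits = proj(visual_features) @ word_embeddings.T  # shape (16, vocab_size)
assignment = F.softmax(logits, dim=-1)

# Each visual token becomes a mixture of word embeddings the LLM already knows.
visual_as_words = assignment @ word_embeddings      # shape (16, llm_dim)
```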

IITK at SemEval-2024 Task 1: Contrastive Learning and Autoencoders for Semantic Textual Relatedness in Multilingual Texts

faceonlive/ai-research 6 Apr 2024

This paper describes our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness.
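
The system itself combines contrastive learning and autoencoders; as a toy illustration of the task only, semantic relatedness can be scored as the cosine similarity of mean-pooled word embeddings (the random vectors below are stand-ins, not the paper's method):

```python
# Toy relatedness score: cosine similarity of averaged word embeddings.
import numpy as np

rng = np.random.default_rng(0)
embeddings = {w: rng.normal(size=50) for w in "the cat sat on a dog ran mat".split()}

def sentence_vector(text):
    return np.mean([embeddings[t] for t in text.split() if t in embeddings], axis=0)

def relatedness(a, b):
    va, vb = sentence_vector(a), sentence_vector(b)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

print(relatedness("the cat sat", "a dog ran"))  # a score in [-1, 1]
```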

BanglaAutoKG: Automatic Bangla Knowledge Graph Construction with Semantic Neural Graph Filtering

azminewasi/banglaautokg 4 Apr 2024

Knowledge Graphs (KGs) have proven essential in information processing and reasoning applications: by linking related entities and providing context-rich information, they support efficient information retrieval and knowledge discovery and present information flow effectively.
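
For readers new to KGs, a toy sketch of that linking idea: facts stored as (head, relation, tail) triples, with retrieval as a lookup over an entity's neighborhood (entity names are illustrative, not from BanglaAutoKG):

```python
# A knowledge graph as (head, relation, tail) triples with simple retrieval.
triples = [
    ("Dhaka", "capital_of", "Bangladesh"),
    ("Bangladesh", "official_language", "Bangla"),
    ("Bangla", "script", "Bengali script"),
]

def neighbors(entity):
    """All facts mentioning an entity, i.e., its context in the graph."""
    return [t for t in triples if entity in (t[0], t[2])]

print(neighbors("Bangladesh"))  # context-rich facts linked to one entity
```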

Breaking the Silence: Detecting and Mitigating Gendered Abuse in Hindi, Tamil, and Indian English Online Spaces

advaithavetagiri/cnlp-nits-pp 2 Apr 2024

Online gender-based harassment is a widespread issue limiting the free expression and participation of women and marginalized genders in digital spaces.

DiLM: Distilling Dataset into Language Model for Text-level Dataset Distillation

arumaekawa/dilm 30 Mar 2024

To address this issue, we propose a novel text dataset distillation approach, called Distilling dataset into Language Model (DiLM), which trains a language model to generate informative synthetic training samples as text data, instead of directly optimizing synthetic samples.
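
DiLM's actual objective is in the paper; the hedged sketch below shows only the generate-then-train idea, using Hugging Face transformers (an assumed dependency) with "gpt2" as a stand-in generator:

```python
# Sketch of generate-then-train: a language model emits synthetic text samples,
# which any downstream trainer can consume as ordinary data (unlike directly
# optimized embeddings, which are tied to a single model's vector space).
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = tok("Movie review:", return_tensors="pt")
out = lm.generate(**prompt, max_new_tokens=30, do_sample=True, top_p=0.9,
                  pad_token_id=tok.eos_token_id)
print(tok.decode(out[0], skip_special_tokens=True))  # one synthetic sample
```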

Debiasing Sentence Embedders through Contrastive Word Pairs

themody/debiasing-sentence-embedders-through-contrastive-word-pairs 27 Mar 2024

Problematically, most debiasing approaches are transferred directly from word embeddings, so they fail to account for the nonlinear nature of sentence embedders and the embeddings they produce.
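
To make that criticism concrete, here is the classic linear debiasing such approaches inherit from word embeddings: project out a bias direction. It is purely linear, which is precisely why it can miss nonlinear structure in sentence embedders (vectors below are random stand-ins):

```python
# Linear hard-debiasing: remove each vector's component along a bias direction.
import numpy as np

rng = np.random.default_rng(0)
he, she = rng.normal(size=50), rng.normal(size=50)
bias_dir = (he - she) / np.linalg.norm(he - she)  # e.g., a gender direction

def debias(v):
    """Subtract the projection of v onto bias_dir."""
    return v - (v @ bias_dir) * bias_dir

v = rng.normal(size=50)
print(abs(debias(v) @ bias_dir) < 1e-10)  # True: no bias component remains
```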

SemRoDe: Macro Adversarial Training to Learn Representations That are Robust to Word-Level Attacks

aniloid2/semrode-macroadversarialtraining 27 Mar 2024

Language models (LMs) are indispensable tools for natural language processing tasks, but their vulnerability to adversarial attacks remains a concern.
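
Not SemRoDe's method, but for context, a toy version of the word-level attacks such training defends against: swap a word for its nearest neighbor in embedding space (random toy embeddings below):

```python
# Toy word-level substitution: replace a word with its embedding-space neighbor.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["good", "great", "fine", "bad", "awful", "movie"]
E = {w: rng.normal(size=50) for w in vocab}

def nearest_neighbor(word):
    """Closest other vocabulary word by cosine similarity."""
    v = E[word]
    score = lambda w: (v @ E[w]) / (np.linalg.norm(v) * np.linalg.norm(E[w]))
    return max((w for w in vocab if w != word), key=score)

sentence = ["good", "movie"]
print([nearest_neighbor(w) if w == "good" else w for w in sentence])
```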

Projective Methods for Mitigating Gender Bias in Pre-trained Language Models

hillary-dawkins/genderswappedstereoset 27 Mar 2024

Mitigation of gender bias in NLP has a long history tied to debiasing static word embeddings.

Prescribing Large Language Models for Perioperative Care: What's The Right Dose for Pre-trained Models?

cja5553/LLMs_in_perioperative_care 27 Feb 2024

Adapting models further improved performance: (1) self-supervised finetuning by 3.2% for AUROC and 1.5% for AUPRC; (2) semi-supervised finetuning by 1.8% for AUROC and 2% for AUPRC, compared to self-supervised finetuning; (3) foundational modelling by 3.6% for AUROC and 2.6% for AUPRC, compared to self-supervised finetuning.
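
For reference, the two metrics being compared, computed with scikit-learn (an assumed dependency; labels and scores below are toy values, not the paper's data):

```python
# AUROC and AUPRC on toy predictions.
from sklearn.metrics import average_precision_score, roc_auc_score

y_true = [0, 0, 1, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]

print("AUROC:", roc_auc_score(y_true, y_score))            # ranking quality
print("AUPRC:", average_precision_score(y_true, y_score))  # precision-recall area
```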
