Speaker Identification

61 papers with code • 4 benchmarks • 4 datasets


SIG: Speaker Identification in Literature via Prompt-Based Generation

sumafuture/SIG 22 Dec 2023

Identifying speakers of quotations in narratives is an important task in literary analysis, with challenging scenarios including out-of-domain inference for unseen speakers and non-explicit cases where the surrounding context contains no speaker mentions.

1

InstructERC: Reforming Emotion Recognition in Conversation with a Retrieval Multi-task LLMs Framework

LIN-SHANG/InstructERC 21 Sep 2023

The field of emotion recognition in conversation (ERC) has focused on separating sentence feature encoding from context modeling, with little exploration of generative paradigms based on unified designs.

98

An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification

harunorikawano/speaker-identification-with-tgp 22 Aug 2023

Wav2vec2 has achieved success in applying Transformer architecture and self-supervised learning to speech recognition.

4

Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment

areffarhadi/gammatonegram_cnn_dysarthric_speech 6 Jul 2023

Dysarthria is a disability that causes a disturbance in the human speech system and reduces the quality and intelligibility of a person's speech.

1

Non-uniform Speaker Disentanglement For Depression Detection From Raw Speech Signals

kingformatty/NUSD 2 Jun 2023

We find that a greater adversarial weight for the initial layers leads to performance improvement.

11

MPCHAT: Towards Multimodal Persona-Grounded Conversation

ahnjaewoo/mpchat 27 May 2023

To build self-consistent personalized dialogue agents, previous research has mostly focused on textual personas that convey personal facts or personalities.

20

GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding

JasonForJoy/MPC-BERT 16 May 2023

Addressing the issue of who says what to whom in multi-party conversations (MPCs) has recently attracted a lot of research attention.

36

Unsupervised Speech Representation Pooling Using Vector Quantization

IIP-Sogang/speech-pooling-benchmark 8 Apr 2023

However, the pooling problem remains; the length of speech representations is inherently variable.

4

ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification

sara-ahmed/asit 23 Nov 2022

Transformers, which were originally developed for natural language processing, have recently generated significant interest in the computer vision and audio communities due to their flexibility in learning long-range relationships.

17

MelHuBERT: A simplified HuBERT on Mel spectrograms

nervjack2/melhubert 17 Nov 2022

Self-supervised models have had great success in learning speech representations that can generalize to various downstream tasks.

53