1 code implementation • 16 Jun 2023 • Huang Xie, Khazar Khorrami, Okko Räsänen, Tuomas Virtanen
Conversely, the results suggest that using only binary relevances defined by captioning-based audio-caption pairs is sufficient for contrastive learning.
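As context for what "binary relevances" means here, only the caption actually paired with an audio clip is treated as relevant and all other captions in the batch as irrelevant. A minimal, hypothetical sketch (not the paper's implementation) of such a contrastive objective as a symmetric cross-entropy over an audio-caption similarity matrix:

```python
import numpy as np

def binary_relevance_contrastive_loss(audio_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy over an audio-caption similarity matrix,
    where only the paired caption (the diagonal) counts as relevant."""
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = a @ t.T / temperature  # (N, N) pairwise similarities

    def nll_diag(m):
        # negative log-softmax probability of the matched pair per row
        m = m - m.max(axis=1, keepdims=True)
        logp = m - np.log(np.exp(m).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # average the audio-to-text and text-to-audio directions
    return 0.5 * (nll_diag(logits) + nll_diag(logits.T))
```

The temperature value and embedding shapes above are illustrative assumptions, not taken from the paper.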
no code implementations • 5 Jun 2023 • Khazar Khorrami, María Andrea Cruz Blandón, Tuomas Virtanen, Okko Räsänen
As a result, we find that sequential training with wav2vec 2.0 first and VGS next provides higher performance on audio-visual retrieval compared to simultaneous optimization of both learning mechanisms.
1 code implementation • 2 Jun 2023 • Marvin Lavechin, Yaya Sy, Hadrien Titeux, María Andrea Cruz Blandón, Okko Räsänen, Hervé Bredin, Emmanuel Dupoux, Alejandrina Cristia
Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.
2 code implementations • 19 May 2023 • Puyuan Peng, Shang-Wen Li, Okko Räsänen, Abdelrahman Mohamed, David Harwath
In this paper, we show that representations capturing syllabic units emerge when training a self-supervised speech model with a visually-grounded training objective.
1 code implementation • 16 May 2023 • Einari Vaaras, Manu Airaksinen, Sampsa Vanhatalo, Okko Räsänen
The recently developed infant wearable MAIJU provides a means to automatically evaluate infants' motor performance in an objective and scalable manner in out-of-hospital settings.
1 code implementation • 3 May 2023 • María Andrea Cruz Blandón, Alejandrina Cristia, Okko Räsänen
Our results show that the use of naturalistic speech data of both modest and high audio quality results in largely similar conclusions on IDS and ADS in terms of acoustic analyses and modelling experiments.
no code implementations • 8 Nov 2022 • Huang Xie, Okko Räsänen, Tuomas Virtanen
Using a fixed training setup for the retrieval system from [1], we study eight negative sampling strategies, including hard and semi-hard negative sampling.
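For context, a semi-hard negative is commonly defined as one that lies farther from the anchor than the positive, but still within a margin of it. A hypothetical selection sketch under that common definition (not necessarily the paper's exact criterion):

```python
import numpy as np

def semi_hard_negatives(anchor, positive, candidates, margin=0.2):
    """Pick semi-hard negatives: candidates farther from the anchor than
    the positive, but within the margin (d_ap < d_an < d_ap + margin)."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(candidates - anchor, axis=1)
    mask = (d_an > d_ap) & (d_an < d_ap + margin)
    return candidates[mask]

anchor = np.array([0.0, 0.0])
positive = np.array([1.0, 0.0])
cands = np.array([[0.5, 0.0],   # hard: closer than the positive
                  [1.1, 0.0],   # semi-hard: within the margin band
                  [3.0, 0.0]])  # easy: far outside the margin
print(semi_hard_negatives(anchor, positive, cands))
```

Hard negative sampling would instead pick the candidates with the smallest `d_an`, i.e. the first row above.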
1 code implementation • 21 Jun 2022 • Einari Vaaras, Manu Airaksinen, Okko Räsänen
In this paper, we combine CPC and multiple dimensionality reduction methods in search of effective practices for clustering-based AL. Our experiments simulating speech emotion recognition system deployment show that both the local and global topology of the feature space can be successfully used for AL, and that CPC can improve clustering-based AL performance over traditional signal features.
no code implementations • 4 Nov 2021 • Kevin Eloff, Okko Räsänen, Herman A. Engelbrecht, Arnu Pretorius, Herman Kamper
Multi-agent reinforcement learning has been used as an effective means to study emergent communication between agents, yet little focus has been given to continuous acoustic communication.
1 code implementation • 6 Oct 2021 • Huang Xie, Okko Räsänen, Konstantinos Drossos, Tuomas Virtanen
We investigate unsupervised learning of correspondences between sound events and textual phrases through aligning audio clips with textual captions describing the content of a whole audio clip.
1 code implementation • 29 Sep 2021 • Khazar Khorrami, Okko Räsänen
We review the extent to which the audiovisual aspect of LLH is supported by existing computational studies.
1 code implementation • 16 Aug 2021 • Yuanyuan Liu, Nelly Penttilä, Tiina Ihalainen, Juulia Lintula, Rachel Convey, Okko Räsänen
Experimental results on a Finnish PD speech corpus demonstrate the efficacy and reliability of the proposed automatic method in deriving VAI, VSA, FCR and F2i/F2u (the second formant ratio for vowels /i/ and /u/).
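For reference, these articulation metrics have standard definitions in terms of the corner-vowel formants. A sketch using the definitions commonly found in the literature, with hypothetical formant values (the paper's actual values are not reproduced here):

```python
def articulation_metrics(F1, F2):
    """Vowel articulation metrics from corner-vowel formants.
    F1, F2: dicts mapping vowels 'i', 'a', 'u' to formant values (Hz)."""
    # Vowel articulation index and its reciprocal, the
    # formant centralization ratio (FCR)
    vai = (F2['i'] + F1['a']) / (F1['i'] + F1['u'] + F2['u'] + F2['a'])
    fcr = 1.0 / vai
    # Vowel space area: area of the /i/-/a/-/u/ triangle in the F1-F2 plane
    vsa = 0.5 * abs(F1['i'] * (F2['a'] - F2['u'])
                    + F1['a'] * (F2['u'] - F2['i'])
                    + F1['u'] * (F2['i'] - F2['a']))
    return vai, fcr, vsa, F2['i'] / F2['u']

# Hypothetical formant values (Hz) for illustration only
F1 = {'i': 300, 'a': 800, 'u': 350}
F2 = {'i': 2300, 'a': 1200, 'u': 800}
vai, fcr, vsa, f2_ratio = articulation_metrics(F1, F2)
```

Centralized (less distinct) vowels shrink the triangle and pull VAI and the F2i/F2u ratio down while FCR rises, which is why these metrics are used as markers of dysarthric speech.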
1 code implementation • 14 Jul 2021 • Afra Alishahi, Grzegorz Chrupała, Alejandrina Cristia, Emmanuel Dupoux, Bertrand Higy, Marvin Lavechin, Okko Räsänen, Chen Yu
We present the visually-grounded language modelling track that was introduced in the second round of the Zero Resource Speech Challenge, 2021 edition.
1 code implementation • 5 Jul 2021 • Khazar Khorrami, Okko Räsänen
We compare the alignment performance using our proposed evaluation metrics to the semantic retrieval task commonly used to evaluate VGS models.
no code implementations • 2 Jul 2021 • Manu Airaksinen, Sampsa Vanhatalo, Okko Räsänen
In addition, we explore the benefits of data augmentation methods in ideal and non-ideal recording conditions.
no code implementations • 14 Jun 2021 • Einari Vaaras, Sari Ahlqvist-Björkroth, Konstantinos Drossos, Okko Räsänen
Researchers have recently started to study how the emotional speech heard by young infants can affect their developmental outcomes.
no code implementations • 25 Nov 2020 • Huang Xie, Okko Räsänen, Tuomas Virtanen
In this paper, we study zero-shot learning in audio classification through factored linear and nonlinear acoustic-semantic projections between audio instances and sound classes.
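As a minimal illustration of the general idea (with hypothetical dimensions and random matrices standing in for learned parameters), a bilinear acoustic-semantic compatibility score lets a classifier rank sound classes it never saw during training:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: acoustic embedding size d_a, semantic
# (class label) embedding size d_s, and a set of unseen classes.
d_a, d_s, n_classes = 8, 4, 3
W = rng.normal(size=(d_a, d_s))                # in practice learned on seen classes
class_emb = rng.normal(size=(n_classes, d_s))  # semantic embeddings of unseen classes

def zero_shot_predict(audio_emb):
    """Score each class by the bilinear compatibility a^T W s and
    return the index of the best-matching (unseen) class."""
    scores = audio_emb @ W @ class_emb.T
    return int(np.argmax(scores))
```

The factored form means audio and class labels are projected into a shared space, so new classes only require a semantic embedding, not retraining.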
2 code implementations • 3 Aug 2020 • Okko Räsänen, María Andrea Cruz Blandón
One potential approach to this problem is to use dynamic time warping (DTW) to find well-aligning patterns from the speech data.
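As a generic illustration of the technique (not the authors' implementation), the cumulative DTW alignment cost between two 1-D feature sequences can be computed with the standard dynamic-programming recursion:

```python
import numpy as np

def dtw_cost(x, y):
    """Minimal DTW: cumulative alignment cost between two 1-D feature
    sequences using absolute frame distance and the standard
    (diagonal / vertical / horizontal) step pattern."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Low alignment cost between two speech segments signals a recurring, well-aligning pattern; real systems use multidimensional frame-level features (e.g. MFCCs) rather than scalars.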
no code implementations • 8 Jul 2020 • María Andrea Cruz Blandón, Okko Räsänen
The present study investigates the behaviour of two predictive coding models, Autoregressive Predictive Coding and Contrastive Predictive Coding, in a phoneme discrimination task (ABX task) for two languages with different dataset sizes.
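A minimal sketch of ABX scoring, under the simplifying assumption of fixed-size embeddings and Euclidean distance (actual ABX evaluations typically aggregate frame-level distances, e.g. via DTW):

```python
import numpy as np

def abx_correct(a, b, x):
    """One ABX trial: X belongs to the same phoneme category as A;
    the trial is answered correctly if X lies closer to A than to B."""
    return np.linalg.norm(x - a) < np.linalg.norm(x - b)

def abx_error_rate(trials):
    """Fraction of (A, B, X) trials answered incorrectly."""
    return 1.0 - sum(abx_correct(a, b, x) for a, b, x in trials) / len(trials)
```

A representation that discriminates phonemes well pushes the ABX error rate toward zero, which is how the predictive coding models above are compared.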
no code implementations • 21 Sep 2019 • Manu Airaksinen, Okko Räsänen, Elina Ilén, Taru Häyrinen, Anna Kivi, Viviana Marchi, Anastasia Gallen, Sonja Blom, Anni Varhe, Nico Kaartinen, Leena Haataja, Sampsa Vanhatalo
These data were manually annotated for infant posture and movement based on video recordings of the sessions, and using a novel annotation scheme specifically designed to assess the overall movement pattern of infants in the given age group.
1 code implementation • 24 Jun 2019 • Shreyas Seshadri, Okko Räsänen
Automatic syllable count estimation (SCE) is used in a variety of applications, ranging from speaking rate estimation and detection of social activity from wearable microphones to developmental research that quantifies the speech heard by language-learning children in different environments.
no code implementations • 24 Jun 2019 • Okko Räsänen, Khazar Khorrami
Earlier research has suggested that human infants might use statistical dependencies between speech and non-linguistic multimodal input to bootstrap their language learning before they know how to segment words from running speech.
no code implementations • ACL 2017 • Paul Michel, Okko Räsänen, Roland Thiollière, Emmanuel Dupoux
Phonemic segmentation of speech is a critical step of speech recognition systems.