no code implementations • 10 Apr 2024 • Taegyun Kwon, Dasaem Jeong, Juhan Nam
To this end, we propose novel architectures for convolutional recurrent neural networks, redesigning an existing autoregressive piano transcription model.
no code implementations • 24 Jan 2024 • Hounsu Kim, Soonbeom Choi, Juhan Nam
Synthesizing the sound of a guitar performance is a highly challenging task due to its polyphony and high variability in expression.
no code implementations • 17 Jan 2024 • Yoonjin Chung, Junwon Lee, Juhan Nam
T-Foley generates high-quality audio using two conditions: the sound class and temporal event feature.
no code implementations • 17 Jan 2024 • Jiyun Park, Sangeon Yong, Taegyun Kwon, Juhan Nam
The goal of real-time lyrics alignment is to take live singing audio as input and to pinpoint the exact position within given lyrics on the fly.
1 code implementation • 16 Nov 2023 • Ilaria Manco, Benno Weck, Seungheon Doh, Minz Won, Yixiao Zhang, Dmitry Bogdanov, Yusong Wu, Ke Chen, Philip Tovstogan, Emmanouil Benetos, Elio Quinton, György Fazekas, Juhan Nam
We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models.
no code implementations • 24 Sep 2023 • Yeonghyeon Lee, Inmo Yeon, Juhan Nam, Joon Son Chung
This paper presents VoiceLDM, a model designed to produce audio that accurately follows two distinct natural language text prompts: the description prompt and the content prompt.
2 code implementations • 20 Sep 2023 • Haven Kim, Jongmin Jung, Dasaem Jeong, Juhan Nam
To broaden the scope of genres and languages in lyric translation studies, we introduce a novel singable lyric translation dataset, approximately 89% of which consists of K-pop song lyrics.
no code implementations • 26 Aug 2023 • Haven Kim, Kento Watanabe, Masataka Goto, Juhan Nam
Lyric translation plays a pivotal role in amplifying the global resonance of music, bridging cultural divides, and fostering universal connections.
1 code implementation • 31 Jul 2023 • Taejun Kim, Juhan Nam
Music is characterized by complex hierarchical structures.
1 code implementation • 31 Jul 2023 • Seungheon Doh, Keunwoo Choi, Jongpil Lee, Juhan Nam
In addition, we trained a transformer-based music captioning model with the dataset and evaluated it under zero-shot and transfer-learning settings.
no code implementations • 25 Jun 2023 • Yuya Yamamoto, Juhan Nam, Hiroko Terasawa
Automatic detection of singing techniques from audio tracks can be beneficial for understanding how each singer expresses a performance, yet it is also difficult due to the wide variety of singing techniques.
no code implementations • 12 Apr 2023 • Sangeon Yong, Li Su, Juhan Nam
Note-level automatic music transcription is one of the most representative music information retrieval (MIR) tasks and has been studied for various instruments to understand music.
no code implementations • 19 Mar 2023 • Seungheon Doh, Minz Won, Keunwoo Choi, Juhan Nam
We introduce a framework that recommends music based on the emotions of speech.
1 code implementation • 14 Jan 2023 • Haven Kim, Seungheon Doh, Junwon Lee, Juhan Nam
Automatically generating or captioning music playlist titles from a set of tracks is of significant interest to music streaming services: customized playlists are widely used in personalized music recommendation, and well-composed titles attract users and aid their music discovery.
3 code implementations • 26 Nov 2022 • Seungheon Doh, Minz Won, Keunwoo Choi, Juhan Nam
This paper introduces effective design choices for text-to-music retrieval systems.
1 code implementation • 14 Nov 2022 • Eunjin Choi, Yoonjin Chung, Seolhee Lee, JongIk Jeon, Taegyun Kwon, Juhan Nam
In addition, they generally lack high-level annotations such as emotion tags.
no code implementations • 31 Oct 2022 • Yuya Yamamoto, Juhan Nam, Hiroko Terasawa
In this paper, we focus on singing techniques within the scope of music information retrieval research.
1 code implementation • 25 Mar 2022 • Sangeun Kum, Jongpil Lee, Keunhyoung Luke Kim, Taehyoung Kim, Juhan Nam
We address this issue by using pseudo-labels produced by vocal pitch estimation models on unlabeled data.
no code implementations • 13 Oct 2021 • Soonbeom Choi, Juhan Nam
We also show that the proposed model can be trained with speech audio and text labels yet generate singing voice at inference time.
no code implementations • NLP4MusA 2021 • Seungheon Doh, Junwon Lee, Juhan Nam
We propose a machine-translation approach to automatically generate a playlist title from a set of music tracks.
no code implementations • 2 Oct 2020 • Taegyun Kwon, Dasaem Jeong, Juhan Nam
Recent advances in polyphonic piano transcription have been made primarily by a deliberate design of neural network architectures that detect different note states such as onset or sustain and model the temporal evolution of the states.
1 code implementation • 24 Aug 2020 • Taejun Kim, Minsuk Choi, Evan Sacks, Yi-Hsuan Yang, Juhan Nam
A DJ mix is a sequence of music tracks concatenated seamlessly, typically rendered for audiences in a live setting by a DJ on stage.
no code implementations • 9 Aug 2020 • Jongpil Lee, Nicholas J. Bryan, Justin Salamon, Zeyu Jin, Juhan Nam
For this, we (1) outline past work on the relationship between metric learning and classification, (2) extend this relationship to multi-label data by exploring three different learning approaches and their disentangled versions, and (3) evaluate all models on four tasks (training time, similarity retrieval, auto-tagging, and triplet prediction).
no code implementations • 9 Aug 2020 • Jongpil Lee, Nicholas J. Bryan, Justin Salamon, Zeyu Jin, Juhan Nam
For this task, it is typically necessary to define a similarity metric to compare one recording to another.
no code implementations • 23 Jul 2020 • Seungheon Doh, Jongpil Lee, Tae Hong Park, Juhan Nam
Word embedding, pioneered by Mikolov et al., is a staple technique for word representation in natural language processing (NLP) research, and it has also found popularity in music information retrieval tasks.
1 code implementation • ISMIR 2019 • Dasaem Jeong, Taegyun Kwon, Yoojin Kim, Kyogu Lee, Juhan Nam
In this paper, we present an application of deep neural networks to modeling piano performance, imitating pianists' expressive control of tempo, dynamics, articulation, and pedaling.
1 code implementation • 30 Oct 2019 • Taejun Kim, Juhan Nam
End-to-end learning models using raw waveforms as input have shown superior performance in many audio recognition tasks.
1 code implementation • 5 Jul 2019 • Jeong Choi, Jongpil Lee, Jiyoung Park, Juhan Nam
Audio-based music classification and tagging is typically based on categorical supervised learning with a fixed set of labels.
no code implementations • 27 Jun 2019 • Jongpil Lee, Jiyoung Park, Juhan Nam
Supervised music representation learning has been performed mainly using semantic labels such as music genres.
1 code implementation • 26 Jun 2019 • Kyungyun Lee, Juhan Nam
We show the effectiveness of our system for singer identification and query-by-singer in both the same-domain and cross-domain tasks.
no code implementations • 20 Jun 2019 • Jeong Choi, Jongpil Lee, Jiyoung Park, Juhan Nam
Music classification and tagging is conducted through categorical supervised learning with a fixed set of labels.
1 code implementation • ICML 2019 • Dasaem Jeong, Taegyun Kwon, Yoojin Kim, Juhan Nam
A music score is often handled as one-dimensional sequential data.
1 code implementation • 18 Jul 2018 • Jongpil Lee, Kyungyun Lee, Jiyoung Park, Jang-Yeon Park, Juhan Nam
Recently, deep-learning-based recommendation systems have been actively explored to solve the cold-start problem using a hybrid approach.
4 code implementations • 4 Jun 2018 • Kyungyun Lee, Keunwoo Choi, Juhan Nam
Since the vocal component plays a crucial role in popular music, singing voice detection has been an active research topic in music information retrieval.
no code implementations • 4 Dec 2017 • Jongpil Lee, Taejun Kim, Jiyoung Park, Juhan Nam
Music, speech, and acoustic scene sound are often handled separately in the audio domain because of their different signal characteristics.
2 code implementations • 28 Oct 2017 • Taejun Kim, Jongpil Lee, Juhan Nam
Recent work has shown that the end-to-end approach using convolutional neural network (CNN) is effective in various types of machine learning tasks.
2 code implementations • 18 Oct 2017 • Jiyoung Park, Jongpil Lee, Jangyeon Park, Jung-Woo Ha, Juhan Nam
In this paper, we present a supervised feature learning approach that uses the artist labels annotated on every track as objective metadata.
1 code implementation • 21 Jun 2017 • Jongpil Lee, Juhan Nam
Music tag words that describe music audio by text have different levels of abstraction.
3 code implementations • 6 Mar 2017 • Jongpil Lee, Jiyoung Park, Keunhyoung Luke Kim, Juhan Nam
Recently, the end-to-end approach that learns hierarchical representations from raw data using deep convolutional neural networks has been successfully explored in the image, text, and speech domains.
1 code implementation • 6 Mar 2017 • Jongpil Lee, Juhan Nam
Second, we extract audio features from each layer of the pre-trained convolutional networks separately and aggregate them over a long audio clip.
1 code implementation • 20 Aug 2015 • Juhan Nam, Jorge Herrera, Kyogu Lee
Feature learning and deep learning have drawn great attention in recent years as a way of transforming input data into more effective representations using learning algorithms.