Children's Song Dataset

Children's Song Dataset is an open-source dataset for singing voice research. It contains 50 Korean and 50 English songs sung by one Korean female professional pop singer. Each song is recorded in two separate keys, resulting in a total of 200 audio recordings. Each audio recording is paired with a MIDI transcription and lyric annotations at both the grapheme level and the phoneme level.

Dataset Structure

The data is split into Korean and English, and each language is further split into 'wav', 'mid', 'lyric', 'txt' and 'csv' folders. Each song uses the same file name in every folder. Additional information for each song, such as the original song name, tempo and time signature, can be found in 'metadata.json'. The folders contain the following information:

  • 'wav': Vocal recordings in 44.1 kHz, 16-bit WAV format
  • 'mid': Score information in MIDI format
  • 'lyric': Lyric information at the grapheme level
  • 'txt': Lyric information at the syllable and phoneme level
  • 'csv': Note onsets, note offsets and syllable timings in comma-separated value (CSV) format
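Because every song shares one file name across the five folders, the files belonging to a song can be collected from a single identifier. The following is a minimal sketch in Python, not an official loader. It assumes a layout of root/language/folder/name.ext; the language directory names, the file extensions in FOLDER_EXTENSIONS, and the helper names song_files and list_song_ids are all hypothetical and should be adjusted to the actual download.

```python
from pathlib import Path

# Assumed mapping from folder name to file extension.
# The dataset description names the folders but not the extensions.
FOLDER_EXTENSIONS = {
    "wav": ".wav",
    "mid": ".mid",
    "lyric": ".txt",
    "txt": ".txt",
    "csv": ".csv",
}


def song_files(root, language, song_id):
    """Return {folder: path} for one song, assuming root/language/folder/song_id.ext."""
    base = Path(root) / language
    return {
        folder: base / folder / f"{song_id}{ext}"
        for folder, ext in FOLDER_EXTENSIONS.items()
    }


def list_song_ids(root, language):
    """List song identifiers by scanning the 'wav' folder; empty if it is absent."""
    wav_dir = Path(root) / language / "wav"
    if not wav_dir.is_dir():
        return []
    return sorted(p.stem for p in wav_dir.glob("*.wav"))
```

The paths returned by song_files are only constructed, not checked, so calling .exists() on each one is a quick way to spot recordings with a missing annotation file.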

License


  • NonCommercial-ShareAlike 4.0 International
