Rendering Music Performance With Interpretation Variations Using Conditional Variational RNN
Capturing and generating a wide variety of musical expression is important in music performance rendering, but current methods fail to model such variation. This paper presents a music performance rendering method that can explicitly model differences in interpretation of a given piece of music. A conditional variational auto-encoder is used to jointly train, conditioned on the music score, an encoder from a performance to a latent code and a decoder from the latent code to a music performance. Evaluation demonstrates that the method is capable of generating a wide variety of human-like expressive music performances as the latent code is varied.
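The architecture described above can be sketched as follows. This is a minimal illustrative PyTorch implementation, not the authors' code: the module names, feature dimensions, the use of GRUs, and the broadcasting of the latent code over time steps are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class CVAEPerformanceRenderer(nn.Module):
    """Hypothetical conditional variational RNN: encoder maps a performance
    (conditioned on the score) to a latent code z; decoder maps (score, z)
    back to performance features. All dimensions are illustrative."""

    def __init__(self, score_dim=8, perf_dim=4, hidden=32, latent=16):
        super().__init__()
        # Encoder RNN reads (score, performance) pairs and parameterizes q(z | x, c)
        self.enc_rnn = nn.GRU(score_dim + perf_dim, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, latent)
        self.to_logvar = nn.Linear(hidden, latent)
        # Decoder RNN reads the score conditioned on z and emits performance features
        self.dec_rnn = nn.GRU(score_dim + latent, hidden, batch_first=True)
        self.to_perf = nn.Linear(hidden, perf_dim)

    def encode(self, score, perf):
        _, h = self.enc_rnn(torch.cat([score, perf], dim=-1))
        h = h[-1]  # final hidden state of the last layer
        return self.to_mu(h), self.to_logvar(h)

    def decode(self, score, z):
        # Broadcast z across time so every decoding step is conditioned on it
        z_seq = z.unsqueeze(1).expand(-1, score.size(1), -1)
        out, _ = self.dec_rnn(torch.cat([score, z_seq], dim=-1))
        return self.to_perf(out)

    def forward(self, score, perf):
        mu, logvar = self.encode(score, perf)
        # Reparameterization trick: z = mu + sigma * eps
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        recon = self.decode(score, z)
        # KL divergence of q(z | x, c) from the unit Gaussian prior
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon, kl

model = CVAEPerformanceRenderer()
score = torch.randn(2, 10, 8)  # (batch, time, score features)
perf = torch.randn(2, 10, 4)   # (batch, time, performance features)
recon, kl = model(score, perf)
print(tuple(recon.shape))  # (2, 10, 4)

# At generation time, sampling different z for the same score yields
# different interpretations of the piece
z = torch.randn(2, 16)
rendered = model.decode(score, z)
```

Training would minimize the reconstruction loss plus the KL term (the standard VAE evidence lower bound); varying the latent code at inference time then produces distinct expressive renditions of one score.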