no code implementations • 24 Feb 2024 • Xia Liang, Xingjian Du, Jiaju Lin, Pei Zou, Yuan Wan, Bilei Zhu
Large Language Models (LLM) have shown encouraging progress in multimodal understanding and generation tasks.
no code implementations • 21 Mar 2023 • Xingjian Du, Zijie Wang, Xia Liang, Huidong Liang, Bilei Zhu, Zejun Ma
Deep learning based methods have become a paradigm for cover song identification (CSI) in recent years, where the ByteCover systems have achieved state-of-the-art results on all the mainstream datasets of CSI.
1 code implementation • 8 Sep 2019 • Xia Liang, Junmin Wu, Jing Cao
In view of the above problem, this paper proposes a RNN-based Hierarchical Multi-modal Fusion Generation Variational Autoencoder (VAE) network, MIDI-Sandwich2, for multi-track symbolic music generation.
no code implementations • 2 Jul 2019 • Xia Liang, Junmin Wu, Yan Yin
The upper layer of HCVAE uses Global Variational Autoencoder(G-VAE) to analyze the latent vector sequence generated by the L-CVAE encoder, to explore the musical relationship between the bars, and to produce the song pieced together by multiple music bars generated by the L-CVAE decoder, which makes the song both have musical structure and sense of direction.