3D-Speaker is a large-scale speech corpus designed to facilitate research on speech representation disentanglement. 3D-Speaker contains over 10,000 speakers, each of whom is recorded simultaneously by multiple Devices located at different Distances, and some of whom speak multiple Dialects. These controlled combinations of multi-dimensional audio data yield a matrix of diversely entangled speech representations, thereby motivating intriguing methods to untangle them.