Search Results for author: Wenhao Guan

Found 2 papers, 0 papers with code

FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View

no code implementations5 Mar 2024 Jiawei Hou, Xiaoyan Li, Wenhao Guan, Gang Zhang, Di Feng, Yuheng Du, xiangyang xue, Jian Pu

In autonomous driving, 3D occupancy prediction outputs voxel-wise status and semantic labels for more comprehensive understandings of 3D scenes compared with traditional perception tasks, such as 3D object detection and bird's-eye view (BEV) semantic segmentation.

3D Object Detection Autonomous Driving +2

MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis

no code implementations17 Dec 2023 Wenhao Guan, Yishuang Li, Tao Li, Hukai Huang, Feng Wang, Jiayan Lin, Lingyan Huang, Lin Li, Qingyang Hong

The challenges of modeling such a multi-modal style controllable TTS mainly lie in two aspects:1)aligning the multi-modal information into a unified style space to enable the input of arbitrary modality as the style prompt in a single system, and 2)efficiently transferring the unified style representation into the given text content, thereby empowering the ability to generate prompt style-related voice.

Speech Synthesis Style Transfer +1

Cannot find the paper you are looking for? You can Submit a new open access paper.