Search Results for author: Jia Jia

Found 31 papers, 7 papers with code

DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance

1 code implementation · 20 Mar 2024 · Zixuan Wang, Jia Jia, Shikun Sun, Haozhe Wu, Rong Han, Zhenyu Li, Di Tang, Jiaqing Zhou, Jiebo Luo

However, camera movement synthesis with music and dance remains an unsolved challenging problem due to the scarcity of paired data.

Siamese Meets Diffusion Network: SMDNet for Enhanced Change Detection in High-Resolution RS Imagery

no code implementations · 17 Jan 2024 · Jia Jia, Geunho Lee, Zhibo Wang, Lyu Zhi, Yuchu He

This network combines the Siam-U2Net Feature Differential Encoder (SU-FDE) and the denoising diffusion implicit model to improve the accuracy of image edge change detection and enhance the model's robustness under environmental changes.

Change Detection, Denoising

Grounding-Prompter: Prompting LLM with Multimodal Information for Temporal Sentence Grounding in Long Videos

no code implementations · 28 Dec 2023 · Houlun Chen, Xin Wang, Hong Chen, Zihan Song, Jia Jia, Wenwu Zhu

To tackle these challenges, in this work we propose a Grounding-Prompter method, which is capable of conducting TSG in long videos through prompting LLM with multimodal information.

Denoising, In-Context Learning, +3

A Discourse-level Multi-scale Prosodic Model for Fine-grained Emotion Analysis

no code implementations · 21 Sep 2023 · Xianhao Wei, Jia Jia, Xiang Li, Zhiyong Wu, Ziyi Wang

More interestingly, although we aim at the synthesis effect of the style transfer model, the synthesized speech by the proposed text prosodic analysis model is even better than the style transfer from the original speech in some user evaluation indicators.

Emotion Recognition, Speech Synthesis, +1

Semantics2Hands: Transferring Hand Motion Semantics between Avatars

1 code implementation · 11 Aug 2023 · Zijie Ye, Jia Jia, Junliang Xing

Human hands, the primary means of non-verbal communication, convey intricate semantics in various scenarios.

Anatomy, motion retargeting

Versatile Face Animator: Driving Arbitrary 3D Facial Avatar in RGBD Space

1 code implementation · 11 Aug 2023 · Haoyu Wang, Haozhe Wu, Junliang Xing, Jia Jia

Creating realistic 3D facial animation is crucial for various applications in the movie production and gaming industry, especially with the burgeoning demand in the metaverse.

motion retargeting, Optical Flow Estimation

Speech-Driven 3D Face Animation with Composite and Regional Facial Movements

1 code implementation · 10 Aug 2023 · Haozhe Wu, Songtao Zhou, Jia Jia, Junliang Xing, Qi Wen, Xiang Wen

This paper emphasizes the importance of considering both the composite and regional natures of facial movements in speech-driven 3D face animation.

3D Face Animation

Exploring the Spatiotemporal Features of Online Food Recommendation Service

no code implementations · 8 Aug 2023 · Shaochuan Lin, Jiayan Pei, Taotao Zhou, Hengxu He, Jia Jia, Ning Hu

Online Food Recommendation Service (OFRS) has remarkable spatiotemporal characteristics and the advantage of being able to conveniently satisfy users' needs in a timely manner.

Food recommendation

Multi-Granularity Attention Model for Group Recommendation

no code implementations · 8 Aug 2023 · Jianye Ji, Jiayan Pei, Shaochuan Lin, Taotao Zhou, Hengxu He, Jia Jia, Ning Hu

Group recommendation provides personalized recommendations to a group of users based on their shared interests, preferences, and characteristics.

Mobile Supply: The Last Piece of Jigsaw of Recommender System

no code implementations · 7 Aug 2023 · Zhenhao Jiang, Biao Zeng, Hao Feng, Jin Liu, Jie Zhang, Jia Jia, Ning Hu

In order to address the problem of pagination trigger mechanism, we propose a completely new module in the pipeline of recommender system named Mobile Supply.

Recommendation Systems, Re-Ranking

ESMC: Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint

no code implementations · 18 Jul 2023 · Zhenhao Jiang, Biao Zeng, Hao Feng, Jin Liu, Jicong Fan, Jie Zhang, Jia Jia, Ning Hu, Xingyu Chen, Xuguang Lan

We propose a novel Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint (ESMC) and two alternatives: Entire Space Multi-Task Model with Siamese Network (ESMS) and Entire Space Multi-Task Model in Global Domain (ESMG) to address the PSC issue.

Decision Making, Recommendation Systems, +1

AvatarFusion: Zero-shot Generation of Clothing-Decoupled 3D Avatars Using 2D Diffusion

no code implementations · 13 Jul 2023 · Shuo Huang, Zongxin Yang, Liangting Li, Yi Yang, Jia Jia

Large-scale pre-trained vision-language models allow for the zero-shot text-based generation of 3D avatars.

Shuffled Autoregression For Motion Interpolation

no code implementations · 10 Jun 2023 · Shuo Huang, Jia Jia, Zongxin Yang, Wei Wang, Haozhe Wu, Yi Yang, Junliang Xing

However, motion interpolation is a more complex problem that takes isolated poses (e.g., only one start pose and one end pose) as input.

Motion Interpolation

BASM: A Bottom-up Adaptive Spatiotemporal Model for Online Food Ordering Service

no code implementations · 22 Nov 2022 · Boya Du, Shaochuan Lin, Jiong Gao, Xiyu Ji, Mengya Wang, Taotao Zhou, Hengxu He, Jia Jia, Ning Hu

Therefore, we address this challenge by proposing a Bottom-up Adaptive Spatiotemporal Model (BASM) to adaptively fit the spatiotemporal data distribution, which further improves the fitting capability of the model.

Recommendation Systems

Spatiotemporal-Enhanced Network for Click-Through Rate Prediction in Location-based Services

no code implementations · 20 Sep 2022 · Shaochuan Lin, Yicong Yu, Xiyu Ji, Taotao Zhou, Hengxu He, Zisen Sang, Jia Jia, Guodong Cao, Ning Hu

In Location-Based Services (LBS), user behavior naturally has a strong dependence on the spatiotemporal information, i.e., in different geographical locations and at different times, user click behavior will change significantly.

Attribute, Click-Through Rate Prediction

Towards Cross-speaker Reading Style Transfer on Audiobook Dataset

no code implementations · 10 Aug 2022 · Xiang Li, Changhe Song, Xianhao Wei, Zhiyong Wu, Jia Jia, Helen Meng

This paper aims to introduce a chunk-wise multi-scale cross-speaker style model to capture both the global genre and the local prosody in audiobook speeches.

Style Transfer

Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis

1 code implementation · 30 Oct 2021 · Haozhe Wu, Jia Jia, Haoyu Wang, Yishun Dou, Chao Duan, Qingshan Deng

Due to such huge differences between different styles, it is necessary to incorporate the talking style into audio-driven talking face synthesis framework.

Face Generation

Towards Multi-Scale Style Control for Expressive Speech Synthesis

no code implementations · 8 Apr 2021 · Xiang Li, Changhe Song, Jingbei Li, Zhiyong Wu, Jia Jia, Helen Meng

This paper introduces a multi-scale speech style modeling method for end-to-end expressive speech synthesis.

Expressive Speech Synthesis, Style Transfer

ChoreoNet: Towards Music to Dance Synthesis with Choreographic Action Unit

no code implementations · 16 Sep 2020 · Zijie Ye, Haozhe Wu, Jia Jia, Yaohua Bu, Wei Chen, Fanbo Meng, Yan-Feng Wang

Meanwhile, human choreographers design dance motions from music in a two-stage manner: they firstly devise multiple choreographic dance units (CAUs), each with a series of dance motions, and then arrange the CAU sequence according to the rhythm, melody and emotion of the music.

Visual-speech Synthesis of Exaggerated Corrective Feedback

no code implementations · 12 Sep 2020 · Yaohua Bu, Weijun Li, Tianyi Ma, Shengqi Chen, Jia Jia, Kun Li, Xiaobo Lu

To provide more discriminative feedback for the second language (L2) learners to better identify their mispronunciation, we propose a method for exaggerated visual-speech feedback in computer-assisted pronunciation training (CAPT).

Speech Synthesis

Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams

no code implementations · 20 Jun 2020 · Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng

Recent approaches mainly have the following limitations: 1) most speaker-independent methods need handcrafted features that are time-consuming to design or unreliable; 2) there is no convincing method to support multilingual or mixlingual speech as input.

Talking Head Generation

Mining Unfollow Behavior in Large-Scale Online Social Networks via Spatial-Temporal Interaction

1 code implementation · 17 Nov 2019 · Haozhe Wu, Zhiyuan Hu, Jia Jia, Yaohua Bu, Xiangnan He, Tat-Seng Chua

Next, we define user's attributes as two categories: spatial attributes (e.g., social role of user) and temporal attributes (e.g., post content of user).

Informativeness

An Online Attention-based Model for Speech Recognition

no code implementations · 13 Nov 2018 · Ruchao Fan, Pan Zhou, Wei Chen, Jia Jia, Gang Liu

In previous work, researchers have shown that such architectures can acquire comparable results to state-of-the-art ASR systems, especially when using a bidirectional encoder and global soft attention (GSA) mechanism.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Exploring RNN-Transducer for Chinese Speech Recognition

no code implementations · 13 Nov 2018 · Senmao Wang, Pan Zhou, Wei Chen, Jia Jia, Lei Xie

End-to-end approaches have drawn much attention recently for significantly simplifying the construction of an automatic speech recognition (ASR) system.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Modality Attention for End-to-End Audio-visual Speech Recognition

no code implementations · 13 Nov 2018 · Pan Zhou, Wenwen Yang, Wei Chen, Yan-Feng Wang, Jia Jia

In this paper, we propose a novel multimodal attention based method for audio-visual speech recognition which could automatically learn the fused representation from both modalities based on their importance.

Audio-Visual Speech Recognition, Robust Speech Recognition, +2
