Search Results for author: Xiang Yin

Found 65 papers, 17 papers with code

Optimal Control Synthesis of Markov Decision Processes for Efficiency with Surveillance Tasks

no code implementations • 27 Mar 2024 • Yu Chen, Xuanyuan Yin, ShaoYuan Li, Xiang Yin

Our objective is to synthesize a control policy that ensures the surveillance task while maximizing efficiency.

Prioritize Team Actions: Multi-Agent Temporal Logic Task Planning with Ordering Constraints

no code implementations • 26 Mar 2024 • Bowen Ye, Jianing Zhao, ShaoYuan Li, Xiang Yin

Simultaneously, we aim to maintain a pre-determined order in the values of the objective function for each agent, which we refer to as the ordering constraints.

Formal Synthesis of Controllers for Safety-Critical Autonomous Systems: Developments and Challenges

no code implementations • 20 Feb 2024 • Xiang Yin, Bingzhao Gao, Xiao Yu

This paper provides a comprehensive review of formal controller synthesis techniques for safety-critical autonomous systems.

A Game-Theoretical Approach for Optimal Supervisory Control of Discrete Event Systems under Energy Constraints

no code implementations • 8 Feb 2024 • Peng Lv, ShaoYuan Li, Xiang Yin

To solve this problem, we propose a game-theoretical approach by converting the cDES into a consumption two-player graph game (cTPG) and reformulating the optimal supervisory control problem in game-theoretic terms.

Learning Local Control Barrier Functions for Safety Control of Hybrid Systems

1 code implementation • 26 Jan 2024 • Shuo Yang, Yu Chen, Xiang Yin, Rahul Mangharam

Our approach is computationally efficient, minimally invasive to any reference controller, and applicable to large-scale systems.

Model Predictive Control

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis

1 code implementation • 16 Jan 2024 • Zhenhui Ye, Tianyun Zhong, Yi Ren, Jiaqi Yang, Weichuang Li, Jiawei Huang, Ziyue Jiang, Jinzheng He, Rongjie Huang, Jinglin Liu, Chen Zhang, Xiang Yin, Zejun Ma, Zhou Zhao

One-shot 3D talking portrait generation aims to reconstruct a 3D avatar from an unseen image, and then animate it with a reference video or audio to generate a talking portrait video.

3D Reconstruction Super-Resolution +1

Contribution Functions for Quantitative Bipolar Argumentation Graphs: A Principle-based Analysis

no code implementations • 16 Jan 2024 • Timotheus Kampik, Nico Potyka, Xiang Yin, Kristijonas Čyras, Francesca Toni

We present a principle-based analysis of contribution functions for quantitative bipolar argumentation graphs that quantify the contribution of one argument to another.

On Approximate Opacity of Stochastic Control Systems

no code implementations • 3 Jan 2024 • Siyuan Liu, Xiang Yin, Dimos V. Dimarogonas, Majid Zamani

Based on this new system relation, we show that one can verify opacity for stochastic control systems using their abstractions (modeled as finite gMDPs).

Relation

Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling

1 code implementation • 19 Dec 2023 • Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li

Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.

Contrastive Learning Speech Synthesis

Synthesis of Temporally-Robust Policies for Signal Temporal Logic Tasks using Reinforcement Learning

1 code implementation • 10 Dec 2023 • Siqi Wang, ShaoYuan Li, Li Yin, Xiang Yin

The second objective is to maximize the worst-case spatial robustness value within a bounded time shift.

Q-Learning

Signal Temporal Logic Control Synthesis among Uncontrollable Dynamic Agents with Conformal Prediction

1 code implementation • 7 Dec 2023 • Xinyi Yu, Yiqi Zhao, Xiang Yin, Lars Lindemann

We propose a predictive control synthesis framework that guarantees, with high probability, the satisfaction of signal temporal logic (STL) tasks that are defined over the system and uncontrollable stochastic agents.

Conformal Prediction valid

Sleep When Everything Looks Fine: Self-Triggered Monitoring for Signal Temporal Logic Tasks

1 code implementation • 27 Nov 2023 • Chuwei Wang, Xinyi Yu, Jianing Zhao, Lars Lindemann, Xiang Yin

Existing works on online monitoring usually assume that the monitor can acquire system information periodically at each time instant.

SoybeanNet: Transformer-Based Convolutional Neural Network for Soybean Pod Counting from Unmanned Aerial Vehicle (UAV) Images

1 code implementation • 16 Oct 2023 • Jiajia Li, Raju Thada Magar, Dong Chen, Feng Lin, Dechun Wang, Xiang Yin, Weichao Zhuang, Zhaojian Li

Soybeans are a critical source of food, protein and oil, and thus have received extensive research aimed at enhancing their yield, refining cultivation practices, and advancing soybean breeding techniques.

Safe-by-Construction Autonomous Vehicle Overtaking using Control Barrier Functions and Model Predictive Control

no code implementations • 10 Oct 2023 • Dingran Yuan, Xinyi Yu, ShaoYuan Li, Xiang Yin

In order to tackle the overtaking task in such challenging scenarios, we introduce a novel integrated framework tailored for vehicle overtaking maneuvers.

Autonomous Driving Model Predictive Control

NNgTL: Neural Network Guided Optimal Temporal Logic Task Planning for Mobile Robots

no code implementations • 25 Sep 2023 • Ruijia Liu, ShaoYuan Li, Xiang Yin

In this work, we investigate task planning for mobile robots under linear temporal logic (LTL) specifications.

Navigate

C2G2: Controllable Co-speech Gesture Generation with Latent Diffusion Model

no code implementations • 29 Aug 2023 • Longbin Ji, Pengfei Wei, Yi Ren, Jinglin Liu, Chen Zhang, Xiang Yin

Co-speech gesture generation is crucial for automatic digital avatar animation.

Gesture Generation

Argument Attribution Explanations in Quantitative Bipolar Argumentation Frameworks (Technical Report)

no code implementations • 25 Jul 2023 • Xiang Yin, Nico Potyka, Francesca Toni

Argumentative explainable AI has been advocated by several authors in recent years, with increasing interest in explaining the reasoning outcomes of Argumentation Frameworks (AFs).

Fake News Detection Recommendation Systems

Efficient STL Control Synthesis under Asynchronous Temporal Robustness Constraints

no code implementations • 24 Jul 2023 • Xinyi Yu, Xiang Yin, Lars Lindemann

Given an ATR bound, we compute a sequence of control inputs so that the specification is satisfied by the system as long as each sub-trajectory is shifted by no more than the ATR bound.

Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis

no code implementations • 14 Jul 2023 • Ziyue Jiang, Jinglin Liu, Yi Ren, Jinzheng He, Zhenhui Ye, Shengpeng Ji, Qian Yang, Chen Zhang, Pengfei Wei, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao

However, the prompting mechanisms of zero-shot TTS still face challenges in the following aspects: 1) previous works of zero-shot TTS are typically trained with single-sentence prompts, which significantly restricts their performance when the data is relatively sufficient during the inference stage.

In-Context Learning Language Modelling +3

GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech

no code implementations • 27 Jun 2023 • Yahuan Cong, Haoyu Zhang, Haopeng Lin, Shichao Liu, Chunfeng Wang, Yi Ren, Xiang Yin, Zejun Ma

Cross-lingual timbre and style generalizable text-to-speech (TTS) aims to synthesize speech with a specific reference timbre or style that is never trained in the target language.

Disentanglement Style Generalization

Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects

1 code implementation • 14 Jun 2023 • Xinghua Qu, Hongyang Liu, Zhu Sun, Xiang Yin, Yew Soon Ong, Lu Lu, Zejun Ma

Conversational recommender systems (CRSs) have become crucial emerging research topics in the field of RSs, thanks to their natural advantages of explicitly acquiring user preferences via interactive conversations and revealing the reasons behind recommendations.

Recommendation Systems

Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis

no code implementations • 6 Jun 2023 • Zhenhui Ye, Ziyue Jiang, Yi Ren, Jinglin Liu, Chen Zhang, Xiang Yin, Zejun Ma, Zhou Zhao

We are interested in a novel task, namely low-resource text-to-talking avatar.

Neural Rendering Video Generation +1

Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

no code implementations • 6 Jun 2023 • Ziyue Jiang, Yi Ren, Zhenhui Ye, Jinglin Liu, Chen Zhang, Qian Yang, Shengpeng Ji, Rongjie Huang, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao

3) We further use a VQGAN-based acoustic model to generate the spectrogram and a latent code language model to fit the distribution of prosody, since prosody changes quickly over time in a sentence, and language models can capture both local and long-range dependencies.

Attribute Inductive Bias +3

Detector Guidance for Multi-Object Text-to-Image Generation

1 code implementation • 4 Jun 2023 • Luping Liu, Zijian Zhang, Yi Ren, Rongjie Huang, Xiang Yin, Zhou Zhao

Previous works identify the problem of information mixing in the CLIP text encoder and introduce the T5 text encoder or incorporate strong prior knowledge to assist with the alignment.

Object object-detection +2

Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

no code implementations • 29 May 2023 • Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao

Finally, we use LLMs to augment and transform a large amount of audio-label data into audio-text datasets to alleviate the problem of scarcity of temporal data.

Ranked #7 on Audio Generation on AudioCaps

Audio Generation Denoising +2

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

no code implementations • 24 May 2023 • Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao

Direct speech-to-speech translation (S2ST) aims to convert speech from one language into another, and has demonstrated significant progress to date.

Speech-to-Speech Translation Translation

GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation

no code implementations • 1 May 2023 • Zhenhui Ye, Jinzheng He, Ziyue Jiang, Rongjie Huang, Jiawei Huang, Jinglin Liu, Yi Ren, Xiang Yin, Zejun Ma, Zhou Zhao

Recently, neural radiance field (NeRF) has become a popular rendering technique in this field since it could achieve high-fidelity and 3D-consistent talking face generation with a few-minute-long training video.

motion prediction Talking Face Generation

Data-Driven Safe Controller Synthesis for Deterministic Systems: A Posteriori Method With Validation Tests

no code implementations • 3 Apr 2023 • Yu Chen, Chao Shang, Xiaolin Huang, Xiang Yin

We first formulate the safety synthesis problem as a robust convex program (RCP) based on the notion of a control barrier function.

LiteG2P: A fast, light and high accuracy model for grapheme-to-phoneme conversion

no code implementations • 2 Mar 2023 • Chunfeng Wang, Peisong Huang, Yuxiang Zou, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma

As a key component of automated speech recognition (ASR) and the front-end in text-to-speech (TTS), grapheme-to-phoneme (G2P) plays the role of converting letters to their corresponding pronunciations.

speech-recognition Speech Recognition

Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models

1 code implementation • 30 Jan 2023 • Rongjie Huang, Jiawei Huang, Dongchao Yang, Yi Ren, Luping Liu, Mingze Li, Zhenhui Ye, Jinglin Liu, Xiang Yin, Zhou Zhao

Its application to audio still lags behind for two main reasons: the lack of large-scale datasets with high-quality text-audio pairs, and the complexity of modeling long continuous audio data.

Ranked #11 on Audio Generation on AudioCaps

Audio Generation Text-to-Video Generation +1

Virtual Try-On with Pose-Garment Keypoints Guided Inpainting

1 code implementation • ICCV 2023 • Zhi Li, Pengfei Wei, Xiang Yin, Zejun Ma, Alex C. Kot

In our method, human pose and garment keypoints are extracted from source images and constructed as graphs to predict the garment keypoints at the target pose.

Virtual Try-on

Direct Speech-to-speech Translation without Textual Annotation using Bottleneck Features

no code implementations • 12 Dec 2022 • Junhui Zhang, Junjie Pan, Xiang Yin, Zejun Ma

Speech-to-speech translation directly translates a speech utterance to another between different languages, and has great potential in tasks such as simultaneous interpretation.

Speech-to-Speech Translation Translation

You Don't Know When I Will Arrive: Unpredictable Controller Synthesis for Temporal Logic Tasks

no code implementations • 23 Nov 2022 • Yu Chen, Shuo Yang, Rahul Mangharam, Xiang Yin

This problem is particularly challenging since future information is involved in the synthesis process.

Robot Task Planning

Markov decision processes with maximum entropy rate for Surveillance Tasks

no code implementations • 23 Nov 2022 • Yu Chen, ShaoYuan Li, Xiang Yin

We consider the problem of synthesizing optimal policies for Markov decision processes (MDP) for both utility objective and security constraint.

Explaining Random Forests using Bipolar Argumentation and Markov Networks (Technical Report)

no code implementations • 21 Nov 2022 • Nico Potyka, Xiang Yin, Francesca Toni

Random forests are decision tree ensembles that can be used to solve a variety of machine learning problems.

Decision Making

Model Predictive Control for Signal Temporal Logic Specifications with Time Interval Decomposition

1 code implementation • 15 Nov 2022 • Xinyi Yu, Chuwei Wang, Dingran Yuan, ShaoYuan Li, Xiang Yin

However, instead of applying MPC directly for the entire task horizon, we decompose the STL formula into several sub-formulae with disjoint time horizons, and shrinking horizon MPC is applied for each short-horizon sub-formula iteratively.

Computational Efficiency Model Predictive Control

Abstraction-Based Verification of Approximate Pre-Opacity for Control Systems

no code implementations • 8 Nov 2022 • Junyao Hou, Siyuan Liu, Xiang Yin, Majid Zamani

In this paper, we first introduce a concept of approximate pre-opacity by capturing the security level of control systems with respect to the measurement precision of the intruder.

Model Predictive Monitoring of Dynamical Systems for Signal Temporal Logic Specifications

1 code implementation • 26 Sep 2022 • Xinyi Yu, Weijie Dong, Xiang Yin, ShaoYuan Li

To this end, effective approaches for the computation of feasible sets of STL formulae are provided.

Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective

1 code implementation • NeurIPS 2023 • Pengfei Wei, Lingdong Kong, Xinghua Qu, Yi Ren, Zhiqiang Xu, Jing Jiang, Xiang Yin

Specifically, we consider the generation of cross-domain videos from two sets of latent factors, one encoding the static information and another encoding the dynamic information.

Action Recognition Disentanglement +1

A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation

no code implementations • 10 Jun 2022 • Junhui Zhang, Wudi Bao, Junjie Pan, Xiang Yin, Zejun Ma

In this paper, we propose a novel Chinese dialect TTS frontend with a translation module, which converts Mandarin text into dialectic expressions to improve the intelligibility and naturalness of synthesized speech.

Machine Translation Translation

Towards a Theory of Faithfulness: Faithful Explanations of Differentiable Classifiers over Continuous Data

no code implementations • 19 May 2022 • Nico Potyka, Xiang Yin, Francesca Toni

There is broad agreement in the literature that explanation methods should be faithful to the model that they explain, but faithfulness remains a rather vague term.

A Unified Framework for Verification of Observational Properties for Partially-Observed Discrete-Event Systems

no code implementations • 3 May 2022 • Jianing Zhao, Xiang Yin, ShaoYuan Li

However, in contrast to existing results, where different verification procedures are developed for different properties case by case, in this work we provide a unified framework for verifying all these properties by reducing each of them to an instance of HyperLTL model checking.

A Uniform Framework for Diagnosis of Discrete-Event Systems with Unreliable Sensors using Linear Temporal Logic

no code implementations • 27 Apr 2022 • Weijie Dong, Xiang Yin, ShaoYuan Li

In this work, we propose a novel uniform framework for diagnosability of DES subject not only to sensor failures but also to a very general class of unreliable sensors.

Fault Diagnosis of Discrete-Event Systems under Non-Deterministic Observations with Output Fairness

no code implementations • 6 Apr 2022 • Weijie Dong, Shang Gao, Xiang Yin, ShaoYuan Li

Non-deterministic observation is a general observation model that includes the case of intermittent loss of observations.

Fairness

To Explore or Not to Explore: Regret-Based LTL Planning in Partially-Known Environments

no code implementations • 1 Apr 2022 • Jianing Zhao, Keyi Zhu, Mingyang Feng, Xiang Yin

In contrast to the standard game-based approach that optimizes the worst-case cost, in the paper, we propose to use regret as a new metric for planning in such a partially-known environment.

You Don't Know What I Know: On Notion of High-Order Opacity in Discrete-Event Systems

no code implementations • 31 Mar 2022 • Bohan Cui, Xiang Yin, ShaoYuan Li, Alessandro Giua

In this paper, we consider a new type of secret related to the knowledge of the system user.

Sensor Deception Attacks Against Initial-State Privacy in Supervisory Control Systems

no code implementations • 31 Mar 2022 • Jingshi Yao, Xiang Yin, ShaoYuan Li

Specifically, we consider an active attacker that can tamper with the observations received by the supervisor by, e.g., hacking the communication channel between the sensors and the supervisor.

Online Monitoring of Dynamic Systems for Signal Temporal Logic Specifications with Model Information

no code implementations • 30 Mar 2022 • Xinyi Yu, Weijie Dong, Xiang Yin, ShaoYuan Li

We show that, by explicitly utilizing the model information of the dynamic system, the proposed online monitoring algorithm can falsify or certify the specification earlier than existing algorithms, which use no model information.

Towards Realistic Visual Dubbing with Heterogeneous Sources

no code implementations • 17 Jan 2022 • Tianyi Xie, Liucheng Liao, Cheng Bi, Benlai Tang, Xiang Yin, Jianfei Yang, Mingjie Wang, Jiali Yao, Yang Zhang, Zejun Ma

The task of few-shot visual dubbing focuses on synchronizing the lip movements with arbitrary speech input for any talking head video.

Disentanglement Talking Head Generation

Data privacy protection in microscopic image analysis for material data mining

no code implementations • 9 Nov 2021 • Boyuan Ma, Xiang Yin, Xiaojuan Ban, Haiyou Huang, Neng Zhang, Hao Wang, Weihua Xue

The core contributions are as follows: 1) a federated learning algorithm is introduced into the polycrystalline microstructure image segmentation task to make full use of different users' data for machine learning, break down data islands, and improve model generalization while ensuring the privacy and security of user data; 2) a data sharing strategy based on style transfer is proposed.

Federated Learning Image Segmentation +2

Towards Using Clothes Style Transfer for Scenario-aware Person Video Generation

1 code implementation • 14 Oct 2021 • Jingning Xu, Benlai Tang, Mingjie Wang, Siyuan Bian, Wenyi Guo, Xiang Yin, Zejun Ma

To tackle this problem, most recent AdaIN-based architectures are proposed to extract clothes and scenario features for generation.

Style Transfer Video Generation

Towards High-fidelity Singing Voice Conversion with Acoustic Reference and Contrastive Predictive Coding

no code implementations • 10 Oct 2021 • Chao Wang, Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Yibiao Yu, Zejun Ma

Experiments show that, compared with the baseline models, our proposed model can significantly improve the naturalness of converted singing voices and the similarity with the target singer.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

1 code implementation • 8 Oct 2021 • Pengfei Wu, Junjie Pan, Chenchang Xu, Junhui Zhang, Lin Wu, Xiang Yin, Zejun Ma

In expressive speech synthesis, there are high requirements for emotion interpretation.

Expressive Speech Synthesis

Personalized Heterogeneous Federated Learning with Gradient Similarity

no code implementations • 29 Sep 2021 • Jing Xie, Xiang Yin, Xiyi Zhang, Juan Chen, Quan Wen, Qiang Yang, Xuan Mo

In SPFL, the server uses the Softmax Normalized Gradient Similarity (SNGS) to weight the relationship between clients, and sends the personalized global model to each client.

Federated Learning

Cloud-Assisted Nonlinear Model Predictive Control for Finite-Duration Tasks

no code implementations • 20 Jun 2021 • Nan Li, Kaixiang Zhang, Zhaojian Li, Vaibhav Srivastava, Xiang Yin

In this paper, we propose a novel cloud-assisted model predictive control (MPC) framework in which we systematically fuse a cloud MPC that uses a high-fidelity nonlinear model but is subject to communication delays with a local MPC that exploits simplified dynamics (due to limited computation) but has timely feedback.

Cloud Computing Model Predictive Control

Optimal Synthesis of Opacity-Enforcing Supervisors for Qualitative and Quantitative Specifications

no code implementations • 2 Feb 2021 • Yifan Xie, Xiang Yin, ShaoYuan Li

We assume that the system has a "secret" that it does not want to be revealed to the intruder.

A Framework for Current-State Opacity under Dynamic Information Release Mechanism

no code implementations • 9 Dec 2020 • Junyao Hou, Xiang Yin, ShaoYuan Li

In this paper, we investigate the verification of current-state opacity for discrete-event systems under Orwellian-type observations, i.e., the system is allowed to re-interpret the observation of an event based on its future suffix.

PPG-based singing voice conversion with adversarial representation learning

no code implementations • 28 Oct 2020 • Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Ling Xu, Chen Shen, Zejun Ma

Singing voice conversion (SVC) aims to convert the voice of one singer to that of other singers while keeping the singing content and melody.

Representation Learning Voice Conversion +1

Opacity Enforcing Supervisory Control using Non-deterministic Supervisors

no code implementations • 20 Oct 2020 • Yifan Xie, Xiang Yin, ShaoYuan Li

Compared with the standard deterministic control mechanism, such a non-deterministic control mechanism can enhance the plausible deniability of the controlled system as the online control decision is a random realization and cannot be implicitly inferred from the control policy.

End-to-End Learning for Simultaneously Generating Decision Map and Multi-Focus Image Fusion Result

2 code implementations • 17 Oct 2020 • Boyuan Ma, Xiang Yin, Di Wu, Xiaojuan Ban

In this work, to handle the requirements of both output image quality and comprehensive simplicity of structure implementation, we propose a cascade network to simultaneously generate decision map and fused result with an end-to-end training procedure.

2D Cyclist Detection

Xiaomingbot: A Multilingual Robot News Reporter

no code implementations • ACL 2020 • Runxin Xu, Jun Cao, Mingxuan Wang, Jiaze Chen, Hao Zhou, Ying Zeng, Yu-Ping Wang, Li Chen, Xiang Yin, Xijin Zhang, Songcheng Jiang, Yuxuan Wang, Lei Li

This paper proposes the building of Xiaomingbot, an intelligent, multilingual and multimodal software robot equipped with four integral capabilities: news generation, news translation, news reading and avatar animation.

News Generation Translation +1

Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech

no code implementations • 19 May 2020 • Wenjie Li, Benlai Tang, Xiang Yin, Yushi Zhao, Wei Li, Kang Wang, Hao Huang, Yuxuan Wang, Zejun Ma

Accent conversion (AC) transforms a non-native speaker's accent into a native accent while maintaining the speaker's voice timbre.

ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders

no code implementations • 23 Apr 2020 • Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma

This paper presents ByteSing, a Chinese singing voice synthesis (SVS) system based on duration allocated Tacotron-like acoustic models and WaveRNN neural vocoders.

Singing Voice Synthesis

A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis

no code implementations • 11 Nov 2019 • Junjie Pan, Xiang Yin, Zhiling Zhang, Shichao Liu, Yang Zhang, Zejun Ma, Yuxuan Wang

In a Mandarin text-to-speech (TTS) system, the front-end text processing module significantly influences the intelligibility and naturalness of synthesized speech.

Polyphone disambiguation Speech Synthesis +1

A hybrid text normalization system using multi-head self-attention for mandarin

no code implementations • 11 Nov 2019 • Junhui Zhang, Junjie Pan, Xiang Yin, Chen Li, Shichao Liu, Yang Zhang, Yuxuan Wang, Zejun Ma

In this paper, we propose a hybrid text normalization system using multi-head self-attention.

Sentence
