Search Results for author: Xiang Yin

Found 65 papers, 17 papers with code

Optimal Control Synthesis of Markov Decision Processes for Efficiency with Surveillance Tasks

no code implementations 27 Mar 2024 Yu Chen, Xuanyuan Yin, ShaoYuan Li, Xiang Yin

Our objective is to synthesize a control policy that ensures the surveillance task while maximizing the efficiency.

Motion Planning

Prioritize Team Actions: Multi-Agent Temporal Logic Task Planning with Ordering Constraints

no code implementations 26 Mar 2024 Bowen Ye, Jianing Zhao, ShaoYuan Li, Xiang Yin

Simultaneously, we aim to maintain a pre-determined order in the values of the objective function for each agent, which we refer to as the ordering constraints.

Formal Synthesis of Controllers for Safety-Critical Autonomous Systems: Developments and Challenges

no code implementations 20 Feb 2024 Xiang Yin, Bingzhao Gao, Xiao Yu

This paper provides a comprehensive review of formal controller synthesis techniques for safety-critical autonomous systems.

A Game-Theoretical Approach for Optimal Supervisory Control of Discrete Event Systems under Energy Constraints

no code implementations 8 Feb 2024 Peng Lv, ShaoYuan Li, Xiang Yin

To solve this problem, we propose a game-theoretical approach by converting the cDES into a consumption two-player graph game (cTPG) and reformulating the optimal supervisory control problem in game theory.

Learning Local Control Barrier Functions for Safety Control of Hybrid Systems

1 code implementation 26 Jan 2024 Shuo Yang, Yu Chen, Xiang Yin, Rahul Mangharam

Our approach is computationally efficient, minimally invasive to any reference controller, and applicable to large-scale systems.

Model Predictive Control

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis

1 code implementation 16 Jan 2024 Zhenhui Ye, Tianyun Zhong, Yi Ren, Jiaqi Yang, Weichuang Li, Jiawei Huang, Ziyue Jiang, Jinzheng He, Rongjie Huang, Jinglin Liu, Chen Zhang, Xiang Yin, Zejun Ma, Zhou Zhao

One-shot 3D talking portrait generation aims to reconstruct a 3D avatar from an unseen image, and then animate it with a reference video or audio to generate a talking portrait video.

3D Reconstruction Super-Resolution +1

Contribution Functions for Quantitative Bipolar Argumentation Graphs: A Principle-based Analysis

no code implementations 16 Jan 2024 Timotheus Kampik, Nico Potyka, Xiang Yin, Kristijonas Čyras, Francesca Toni

We present a principle-based analysis of contribution functions for quantitative bipolar argumentation graphs that quantify the contribution of one argument to another.

On Approximate Opacity of Stochastic Control Systems

no code implementations 3 Jan 2024 Siyuan Liu, Xiang Yin, Dimos V. Dimarogonas, Majid Zamani

Based on this new system relation, we show that one can verify opacity for stochastic control systems using their abstractions (modeled as finite gMDPs).

Relation

Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling

1 code implementation 19 Dec 2023 Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li

Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.

Contrastive Learning Speech Synthesis

Synthesis of Temporally-Robust Policies for Signal Temporal Logic Tasks using Reinforcement Learning

1 code implementation 10 Dec 2023 Siqi Wang, ShaoYuan Li, Li Yin, Xiang Yin

The second objective is to maximize the worst-case spatial robustness value within a bounded time shift.

Q-Learning

Signal Temporal Logic Control Synthesis among Uncontrollable Dynamic Agents with Conformal Prediction

1 code implementation 7 Dec 2023 Xinyi Yu, Yiqi Zhao, Xiang Yin, Lars Lindemann

We propose a predictive control synthesis framework that guarantees, with high probability, the satisfaction of signal temporal logic (STL) tasks that are defined over the system and uncontrollable stochastic agents.

Conformal Prediction valid

Sleep When Everything Looks Fine: Self-Triggered Monitoring for Signal Temporal Logic Tasks

1 code implementation 27 Nov 2023 Chuwei Wang, Xinyi Yu, Jianing Zhao, Lars Lindemann, Xiang Yin

Existing works on online monitoring usually assume that the monitor can acquire system information periodically at each time instant.

SoybeanNet: Transformer-Based Convolutional Neural Network for Soybean Pod Counting from Unmanned Aerial Vehicle (UAV) Images

1 code implementation 16 Oct 2023 Jiajia Li, Raju Thada Magar, Dong Chen, Feng Lin, Dechun Wang, Xiang Yin, Weichao Zhuang, Zhaojian Li

Soybeans are a critical source of food, protein and oil, and thus have received extensive research aimed at enhancing their yield, refining cultivation practices, and advancing soybean breeding techniques.

Safe-by-Construction Autonomous Vehicle Overtaking using Control Barrier Functions and Model Predictive Control

no code implementations 10 Oct 2023 Dingran Yuan, Xinyi Yu, ShaoYuan Li, Xiang Yin

In order to tackle the overtaking task in such challenging scenarios, we introduce a novel integrated framework tailored for vehicle overtaking maneuvers.

Autonomous Driving Model Predictive Control

NNgTL: Neural Network Guided Optimal Temporal Logic Task Planning for Mobile Robots

no code implementations 25 Sep 2023 Ruijia Liu, ShaoYuan Li, Xiang Yin

In this work, we investigate task planning for mobile robots under linear temporal logic (LTL) specifications.

Navigate

Argument Attribution Explanations in Quantitative Bipolar Argumentation Frameworks (Technical Report)

no code implementations 25 Jul 2023 Xiang Yin, Nico Potyka, Francesca Toni

Argumentative explainable AI has been advocated by several authors in recent years, with an increasing interest in explaining the reasoning outcomes of Argumentation Frameworks (AFs).

Fake News Detection Recommendation Systems

Efficient STL Control Synthesis under Asynchronous Temporal Robustness Constraints

no code implementations 24 Jul 2023 Xinyi Yu, Xiang Yin, Lars Lindemann

Given an ATR bound, we compute a sequence of control inputs so that the specification is satisfied by the system as long as each sub-trajectory is shifted not more than the ATR bound.

Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis

no code implementations 14 Jul 2023 Ziyue Jiang, Jinglin Liu, Yi Ren, Jinzheng He, Zhenhui Ye, Shengpeng Ji, Qian Yang, Chen Zhang, Pengfei Wei, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao

However, the prompting mechanisms of zero-shot TTS still face challenges in the following aspects: 1) previous works of zero-shot TTS are typically trained with single-sentence prompts, which significantly restricts their performance when the data is relatively sufficient during the inference stage.

In-Context Learning Language Modelling +3

GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech

no code implementations 27 Jun 2023 Yahuan Cong, Haoyu Zhang, Haopeng Lin, Shichao Liu, Chunfeng Wang, Yi Ren, Xiang Yin, Zejun Ma

Cross-lingual timbre and style generalizable text-to-speech (TTS) aims to synthesize speech with a specific reference timbre or style that is never trained in the target language.

Disentanglement Style Generalization

Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects

1 code implementation 14 Jun 2023 Xinghua Qu, Hongyang Liu, Zhu Sun, Xiang Yin, Yew Soon Ong, Lu Lu, Zejun Ma

Conversational recommender systems (CRSs) have become crucial emerging research topics in the field of RSs, thanks to their natural advantages of explicitly acquiring user preferences via interactive conversations and revealing the reasons behind recommendations.

Recommendation Systems

Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

no code implementations 6 Jun 2023 Ziyue Jiang, Yi Ren, Zhenhui Ye, Jinglin Liu, Chen Zhang, Qian Yang, Shengpeng Ji, Rongjie Huang, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao

3) We further use a VQGAN-based acoustic model to generate the spectrogram and a latent code language model to fit the distribution of prosody, since prosody changes quickly over time in a sentence, and language models can capture both local and long-range dependencies.

Attribute Inductive Bias +3

Detector Guidance for Multi-Object Text-to-Image Generation

1 code implementation 4 Jun 2023 Luping Liu, Zijian Zhang, Yi Ren, Rongjie Huang, Xiang Yin, Zhou Zhao

Previous works identify the problem of information mixing in the CLIP text encoder and introduce the T5 text encoder or incorporate strong prior knowledge to assist with the alignment.

Object object-detection +2

Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

no code implementations 29 May 2023 Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao

Finally, we use LLMs to augment and transform a large amount of audio-label data into audio-text datasets to alleviate the problem of scarcity of temporal data.

Audio Generation Denoising +2

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

no code implementations 24 May 2023 Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao

Direct speech-to-speech translation (S2ST) aims to convert speech from one language into another, and has demonstrated significant progress to date.

Speech-to-Speech Translation Translation

GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation

no code implementations 1 May 2023 Zhenhui Ye, Jinzheng He, Ziyue Jiang, Rongjie Huang, Jiawei Huang, Jinglin Liu, Yi Ren, Xiang Yin, Zejun Ma, Zhou Zhao

Recently, neural radiance field (NeRF) has become a popular rendering technique in this field since it could achieve high-fidelity and 3D-consistent talking face generation with a few-minute-long training video.

motion prediction Talking Face Generation

Data-Driven Safe Controller Synthesis for Deterministic Systems: A Posteriori Method With Validation Tests

no code implementations 3 Apr 2023 Yu Chen, Chao Shang, Xiaolin Huang, Xiang Yin

We first formulate the safety synthesis problem as a robust convex program (RCP) based on the notion of control barrier functions.

LiteG2P: A fast, light and high accuracy model for grapheme-to-phoneme conversion

no code implementations 2 Mar 2023 Chunfeng Wang, Peisong Huang, Yuxiang Zou, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma

As a key component of automated speech recognition (ASR) and the front-end in text-to-speech (TTS), grapheme-to-phoneme (G2P) plays the role of converting letters to their corresponding pronunciations.

Speech Recognition

Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models

1 code implementation 30 Jan 2023 Rongjie Huang, Jiawei Huang, Dongchao Yang, Yi Ren, Luping Liu, Mingze Li, Zhenhui Ye, Jinglin Liu, Xiang Yin, Zhou Zhao

Its application to audio still lags behind for two main reasons: the lack of large-scale datasets with high-quality text-audio pairs, and the complexity of modeling long continuous audio data.

Audio Generation Text-to-Video Generation +1

Virtual Try-On with Pose-Garment Keypoints Guided Inpainting

1 code implementation ICCV 2023 Zhi Li, Pengfei Wei, Xiang Yin, Zejun Ma, Alex C. Kot

In our method, human pose and garment keypoints are extracted from source images and constructed as graphs to predict the garment keypoints at the target pose.

Virtual Try-on

Direct Speech-to-speech Translation without Textual Annotation using Bottleneck Features

no code implementations 12 Dec 2022 Junhui Zhang, Junjie Pan, Xiang Yin, Zejun Ma

Speech-to-speech translation directly translates a speech utterance to another between different languages, and has great potential in tasks such as simultaneous interpretation.

Speech-to-Speech Translation Translation

You Don't Know When I Will Arrive: Unpredictable Controller Synthesis for Temporal Logic Tasks

no code implementations 23 Nov 2022 Yu Chen, Shuo Yang, Rahul Mangharam, Xiang Yin

This problem is particularly challenging since future information is involved in the synthesis process.

Robot Task Planning

Markov decision processes with maximum entropy rate for Surveillance Tasks

no code implementations 23 Nov 2022 Yu Chen, ShaoYuan Li, Xiang Yin

We consider the problem of synthesizing optimal policies for Markov decision processes (MDPs) under both a utility objective and a security constraint.

Explaining Random Forests using Bipolar Argumentation and Markov Networks (Technical Report)

no code implementations 21 Nov 2022 Nico Potyka, Xiang Yin, Francesca Toni

Random forests are decision tree ensembles that can be used to solve a variety of machine learning problems.

Decision Making

Model Predictive Control for Signal Temporal Logic Specifications with Time Interval Decomposition

1 code implementation 15 Nov 2022 Xinyi Yu, Chuwei Wang, Dingran Yuan, ShaoYuan Li, Xiang Yin

However, instead of applying MPC directly for the entire task horizon, we decompose the STL formula into several sub-formulae with disjoint time horizons, and shrinking horizon MPC is applied for each short-horizon sub-formula iteratively.

Computational Efficiency Model Predictive Control

Abstraction-Based Verification of Approximate Pre-Opacity for Control Systems

no code implementations 8 Nov 2022 Junyao Hou, Siyuan Liu, Xiang Yin, Majid Zamani

In this paper, we first introduce a concept of approximate pre-opacity by capturing the security level of control systems with respect to the measurement precision of the intruder.

Model Predictive Monitoring of Dynamical Systems for Signal Temporal Logic Specifications

1 code implementation 26 Sep 2022 Xinyi Yu, Weijie Dong, Xiang Yin, ShaoYuan Li

To this end, effective approaches for the computation of feasible sets of STL formulae are provided.

Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective

1 code implementation NeurIPS 2023 Pengfei Wei, Lingdong Kong, Xinghua Qu, Yi Ren, Zhiqiang Xu, Jing Jiang, Xiang Yin

Specifically, we consider the generation of cross-domain videos from two sets of latent factors, one encoding the static information and another encoding the dynamic information.

Action Recognition Disentanglement +1

A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation

no code implementations 10 Jun 2022 Junhui Zhang, Wudi Bao, Junjie Pan, Xiang Yin, Zejun Ma

In this paper, we propose a novel Chinese dialect TTS frontend with a translation module, which converts Mandarin text into dialectic expressions to improve the intelligibility and naturalness of synthesized speech.

Machine Translation Translation

Towards a Theory of Faithfulness: Faithful Explanations of Differentiable Classifiers over Continuous Data

no code implementations 19 May 2022 Nico Potyka, Xiang Yin, Francesca Toni

There is broad agreement in the literature that explanation methods should be faithful to the model that they explain, but faithfulness remains a rather vague term.

A Unified Framework for Verification of Observational Properties for Partially-Observed Discrete-Event Systems

no code implementations 3 May 2022 Jianing Zhao, Xiang Yin, ShaoYuan Li

However, in contrast to existing results, where different verification procedures are developed for different properties case-by-case, in this work, we provide a unified framework for verifying all these properties by reducing each of them to an instance of HyperLTL model checking.

A Uniform Framework for Diagnosis of Discrete-Event Systems with Unreliable Sensors using Linear Temporal Logic

no code implementations 27 Apr 2022 Weijie Dong, Xiang Yin, ShaoYuan Li

In this work, we propose a novel uniform framework for diagnosability of DES subject to not only sensor failures, but also a very general class of unreliable sensors.

Fault Diagnosis of Discrete-Event Systems under Non-Deterministic Observations with Output Fairness

no code implementations 6 Apr 2022 Weijie Dong, Shang Gao, Xiang Yin, ShaoYuan Li

Non-deterministic observation is a general observation model that includes the case of intermittent loss of observations.

Fairness

To Explore or Not to Explore: Regret-Based LTL Planning in Partially-Known Environments

no code implementations 1 Apr 2022 Jianing Zhao, Keyi Zhu, Mingyang Feng, Xiang Yin

In contrast to the standard game-based approach that optimizes the worst-case cost, in this paper, we propose to use regret as a new metric for planning in such a partially-known environment.

You Don't Know What I Know: On Notion of High-Order Opacity in Discrete-Event Systems

no code implementations 31 Mar 2022 Bohan Cui, Xiang Yin, ShaoYuan Li, Alessandro Giua

In this paper, we consider a new type of secret related to the knowledge of the system user.

Sensor Deception Attacks Against Initial-State Privacy in Supervisory Control Systems

no code implementations 31 Mar 2022 Jingshi Yao, Xiang Yin, ShaoYuan Li

Specifically, we consider an active attacker that can tamper with the observations received by the supervisor by, e.g., hacking the communication channel between the sensors and the supervisor.

Online Monitoring of Dynamic Systems for Signal Temporal Logic Specifications with Model Information

no code implementations 30 Mar 2022 Xinyi Yu, Weijie Dong, Xiang Yin, ShaoYuan Li

We show that, by explicitly utilizing the model information of the dynamic system, the proposed online monitoring algorithm can falsify or certify the specification in advance compared with existing algorithms, where no model information is used.

Towards Realistic Visual Dubbing with Heterogeneous Sources

no code implementations 17 Jan 2022 Tianyi Xie, Liucheng Liao, Cheng Bi, Benlai Tang, Xiang Yin, Jianfei Yang, Mingjie Wang, Jiali Yao, Yang Zhang, Zejun Ma

The task of few-shot visual dubbing focuses on synchronizing the lip movements with arbitrary speech input for any talking head video.

Disentanglement Talking Head Generation

Data privacy protection in microscopic image analysis for material data mining

no code implementations 9 Nov 2021 Boyuan Ma, Xiang Yin, Xiaojuan Ban, Haiyou Huang, Neng Zhang, Hao Wang, Weihua Xue

The core contributions are as follows: 1) the federated learning algorithm is introduced into the polycrystalline microstructure image segmentation task to make full use of different user data to carry out machine learning, break the data island and improve the model generalization ability under the condition of ensuring the privacy and security of user data; 2) A data sharing strategy based on style transfer is proposed.

Federated Learning Image Segmentation +2

Towards Using Clothes Style Transfer for Scenario-aware Person Video Generation

1 code implementation 14 Oct 2021 Jingning Xu, Benlai Tang, Mingjie Wang, Siyuan Bian, Wenyi Guo, Xiang Yin, Zejun Ma

To tackle this problem, most recent AdaIN-based architectures are proposed to extract clothes and scenario features for generation.

Style Transfer Video Generation

Towards High-fidelity Singing Voice Conversion with Acoustic Reference and Contrastive Predictive Coding

no code implementations 10 Oct 2021 Chao Wang, Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Yibiao Yu, Zejun Ma

Experiments show that, compared with the baseline models, our proposed model can significantly improve the naturalness of converted singing voices and the similarity with the target singer.

Automatic Speech Recognition (ASR) +2

Personalized Heterogeneous Federated Learning with Gradient Similarity

no code implementations 29 Sep 2021 Jing Xie, Xiang Yin, Xiyi Zhang, Juan Chen, Quan Wen, Qiang Yang, Xuan Mo

In SPFL, the server uses the Softmax Normalized Gradient Similarity (SNGS) to weight the relationship between clients, and sends the personalized global model to each client.

Federated Learning

Cloud-Assisted Nonlinear Model Predictive Control for Finite-Duration Tasks

no code implementations 20 Jun 2021 Nan Li, Kaixiang Zhang, Zhaojian Li, Vaibhav Srivastava, Xiang Yin

In this paper, we propose a novel cloud-assisted model predictive control (MPC) framework in which we systematically fuse a cloud MPC that uses a high-fidelity nonlinear model but is subject to communication delays with a local MPC that exploits simplified dynamics (due to limited computation) but has timely feedback.

Cloud Computing Model Predictive Control

Optimal Synthesis of Opacity-Enforcing Supervisors for Qualitative and Quantitative Specifications

no code implementations 2 Feb 2021 Yifan Xie, Xiang Yin, ShaoYuan Li

We assume that the system has a "secret" that it does not want to be revealed to the intruder.

A Framework for Current-State Opacity under Dynamic Information Release Mechanism

no code implementations 9 Dec 2020 Junyao Hou, Xiang Yin, ShaoYuan Li

In this paper, we investigate the verification of current-state opacity for discrete-event systems under Orwellian-type observations, i.e., the system is allowed to re-interpret the observation of an event based on its future suffix.

PPG-based singing voice conversion with adversarial representation learning

no code implementations 28 Oct 2020 Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Ling Xu, Chen Shen, Zejun Ma

Singing voice conversion (SVC) aims to convert the voice of one singer to that of other singers while keeping the singing content and melody.

Representation Learning Voice Conversion +1

Opacity Enforcing Supervisory Control using Non-deterministic Supervisors

no code implementations 20 Oct 2020 Yifan Xie, Xiang Yin, ShaoYuan Li

Compared with the standard deterministic control mechanism, such a non-deterministic control mechanism can enhance the plausible deniability of the controlled system as the online control decision is a random realization and cannot be implicitly inferred from the control policy.

End-to-End Learning for Simultaneously Generating Decision Map and Multi-Focus Image Fusion Result

2 code implementations 17 Oct 2020 Boyuan Ma, Xiang Yin, Di wu, Xiaojuan Ban

In this work, to handle the requirements of both output image quality and comprehensive simplicity of structure implementation, we propose a cascade network to simultaneously generate decision map and fused result with an end-to-end training procedure.

2D Cyclist Detection

Xiaomingbot: A Multilingual Robot News Reporter

no code implementations ACL 2020 Runxin Xu, Jun Cao, Mingxuan Wang, Jiaze Chen, Hao Zhou, Ying Zeng, Yu-Ping Wang, Li Chen, Xiang Yin, Xijin Zhang, Songcheng Jiang, Yuxuan Wang, Lei LI

This paper proposes the building of Xiaomingbot, an intelligent, multilingual and multimodal software robot equipped with four integral capabilities: news generation, news translation, news reading and avatar animation.

News Generation Translation +1

Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech

no code implementations 19 May 2020 Wenjie Li, Benlai Tang, Xiang Yin, Yushi Zhao, Wei Li, Kang Wang, Hao Huang, Yuxuan Wang, Zejun Ma

Accent conversion (AC) transforms a non-native speaker's accent into a native accent while maintaining the speaker's voice timbre.

ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders

no code implementations 23 Apr 2020 Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma

This paper presents ByteSing, a Chinese singing voice synthesis (SVS) system based on duration allocated Tacotron-like acoustic models and WaveRNN neural vocoders.

Singing Voice Synthesis

A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis

no code implementations 11 Nov 2019 Junjie Pan, Xiang Yin, Zhiling Zhang, Shichao Liu, Yang Zhang, Zejun Ma, Yuxuan Wang

In Mandarin text-to-speech (TTS) systems, the front-end text processing module significantly influences the intelligibility and naturalness of synthesized speech.

Polyphone disambiguation Speech Synthesis +1
