Search Results for author: Xu Li

Found 45 papers, 10 papers with code

CLAPSep: Leveraging Contrastive Pre-trained Models for Multi-Modal Query-Conditioned Target Sound Extraction

1 code implementation • 27 Feb 2024 • Hao Ma, Zhiyuan Peng, Mingjie Shao, Ju Liu, Xu Li, Xixin Wu

Such systems consist of two components: a query network that converts user queries into conditional embeddings, and a separation network that extracts the target sound based on conditional embeddings.

Target Sound Extraction

Parametric Feature Transfer: One-shot Federated Learning with Foundation Models

no code implementations • 2 Feb 2024 • Mahdi Beitollahi, Alex Bie, Sobhan Hemati, Leo Maxime Brunswic, Xu Li, Xi Chen, Guojun Zhang

This paper introduces FedPFT (Federated Learning with Parametric Feature Transfer), a methodology that harnesses the transferability of foundation models to enhance both accuracy and communication efficiency in one-shot FL.

Federated Learning

DFML: Decentralized Federated Mutual Learning

no code implementations • 2 Feb 2024 • Yasser H. Khalil, Amir H. Estiri, Mahdi Beitollahi, Nader Asadi, Sobhan Hemati, Xu Li, Guojun Zhang, Xi Chen

In the realm of real-world devices, centralized servers in Federated Learning (FL) present challenges including communication bottlenecks and susceptibility to a single point of failure.

Federated Learning

Recovering Linear Causal Models with Latent Variables via Cholesky Factorization of Covariance Matrix

no code implementations • 1 Nov 2023 • Yunfeng Cai, Xu Li, Minging Sun, Ping Li

In this paper, we first propose a DAG structure recovering algorithm, which is based on the Cholesky factorization of the covariance matrix of the observed data.

Enhancing the vocal range of single-speaker singing voice synthesis with melody-unsupervised pre-training

no code implementations • 1 Sep 2023 • Shaohuan Zhou, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng

Specifically, in the pre-training step, we design a phoneme predictor to produce the frame-level phoneme probability vectors as the phonemic timing information and a speaker encoder to model the timbre variations of different singers, and directly estimate the frame-level f0 values from the audio to provide the pitch information.

Singing Voice Synthesis Unsupervised Pre-training

The Impacts of Registration Regime Implementation on IPO Pricing Efficiency

no code implementations • 18 Jul 2023 • Qi Deng, Linhong Zheng, Jiaqi Peng, Xu Li, Zhong-guo Zhou, Monica Hussein, Dingyi Chen, Mick Swartz

We find that the most efficient regulation regime in Chinese IPO pricing has four characteristics: 1) a registration system; 2) neither hard return caps nor trading curbs that restrict the initial return; 3) more specific listing rules for issuers; and 4) more stringent participation requirements for investors.

Inferring Gene Regulatory Neural Networks for Bacterial Decision Making in Biofilms

no code implementations • 10 Jan 2023 • Samitha Somathilaka, Daniel P. Martins, Xu Li, Yusong Li, Sasitharan Balasubramaniam

These incoming external signals are then processed using a Gene Regulatory Network (GRN), exhibiting similarities to modern computing algorithms.

Decision Making

Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation

no code implementations • 23 Oct 2022 • Xiaoyu Liu, Xu Li, Joan Serrà

Single channel target speaker separation (TSS) aims at extracting a speaker's voice from a mixture of multiple talkers given an enrollment utterance of that speaker.

Speaker Identification Speaker Separation

DeepSTI: Towards Tensor Reconstruction using Fewer Orientations in Susceptibility Tensor Imaging

no code implementations • 9 Sep 2022 • Zhenghan Fang, Kuo-Wei Lai, Peter van Zijl, Xu Li, Jeremias Sulam

Experimental results using both simulation and in vivo human data demonstrate great improvement over state-of-the-art algorithms in terms of the reconstructed tensor image, principal eigenvector maps and tractography results, while allowing for tensor reconstruction with MR phase measured at far fewer than six different orientations.

Image Reconstruction

A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion

no code implementations • 28 Jun 2022 • Xu Li, Shansong Liu, Ying Shan

It is suspected that a single embedding vector may only capture averaged and coarse-grained speaker characteristics, which is insufficient for the SVC task.

Speaker Recognition Voice Conversion

CGAR: Critic Guided Action Redistribution in Reinforcement Leaning

1 code implementation • 23 Jun 2022 • Tairan Huang, Xu Li, Hao Li, Mingming Sun, Ping Li

Under the setting of off-policy actor-critic algorithms, we demonstrate that the critic can bring expected discounted rewards greater than or at least equal to those of the actor.

Reinforcement Learning (RL)

Joint learning of object graph and relation graph for visual question answering

no code implementations • 9 May 2022 • Hao Li, Xu Li, Belhal Karimi, Jie Chen, Mingming Sun

Modeling visual question answering (VQA) through scene graphs can significantly improve reasoning accuracy and interpretability.

Attribute Question Answering +2

Spoofing-Aware Speaker Verification by Multi-Level Fusion

no code implementations • 29 Mar 2022 • Haibin Wu, Lingwei Meng, Jiawen Kang, Jinchao Li, Xu Li, Xixin Wu, Hung-Yi Lee, Helen Meng

In the second-level fusion, the CM score and ASV scores directly from ASV systems will be concatenated into a prediction block for the final decision.

Speaker Verification

PathSAGE: Spatial Graph Attention Neural Networks With Random Path Sampling

no code implementations • 11 Mar 2022 • Junhua Ma, Jiajun Li, Xueming Li, Xu Li

To address these problems, we propose a model called PathSAGE, which can learn high-order topological information and improve the model's performance by expanding the receptive field.

Graph Attention

Characterizing the adversarial vulnerability of speech self-supervised learning

no code implementations • 8 Nov 2021 • Haibin Wu, Bo Zheng, Xu Li, Xixin Wu, Hung-Yi Lee, Helen Meng

As the paradigm of a self-supervised learning upstream model followed by downstream tasks attracts more attention in the speech community, characterizing the adversarial robustness of such a paradigm is of high priority.

Adversarial Robustness Benchmarking +2

Arbitrary Distribution Modeling with Censorship in Real-Time Bidding Advertising

1 code implementation • 26 Oct 2021 • Xu Li, Michelle Ma Zhang, Youjun Tong, Zhenya Wang

The purpose of Inventory Pricing is to bid the right prices to online ad opportunities, which is crucial for a Demand-Side Platform (DSP) to win advertising auctions in Real-Time Bidding (RTB).

Causal Discovery via Cholesky Factorization

no code implementations • 29 Sep 2021 • Xu Li, Yunfeng Cai, Mingming Sun, Ping Li

Discovering the causal relationship via recovering the directed acyclic graph (DAG) structure from the observed data is a challenging combinatorial problem.

Causal Discovery

S$^2$-MLPv2: Improved Spatial-Shift MLP Architecture for Vision

3 code implementations • 2 Aug 2021 • Tan Yu, Xu Li, Yunfeng Cai, Mingming Sun, Ping Li

More recently, using smaller patches with a pyramid structure, Vision Permutator (ViP) and Global Filter Network (GFNet) achieve better performance than S$^2$-MLP.

Inductive Bias

Channel-wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks

2 code implementations • 19 Jul 2021 • Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng

This argument motivates the current work that presents a novel, channel-wise gated Res2Net (CG-Res2Net), which modifies Res2Net to enable a channel-wise gating mechanism in the connection between feature groups.

Speaker Verification

Rethinking Token-Mixing MLP for MLP-based Vision Backbone

no code implementations • 28 Jun 2021 • Tan Yu, Xu Li, Yunfeng Cai, Mingming Sun, Ping Li

By introducing the inductive bias from image processing, the convolutional neural network (CNN) has achieved excellent performance in numerous computer vision tasks and has been established as the de facto backbone.

Inductive Bias

A Scalable 256-Elements E-Band Phased-Array Transceiver for Broadband Communication

no code implementations • 20 Jun 2021 • Xu Li, Wenyao Zhai, Morris Repeta, Hua Cai, Tyler Ross, Kimia Ansari, Sam Tiller, Hari Krishna Pothula, Dong Liang, Fan Yang, Yibo Lyu, Songlin Shuai, Guangjian Wang, Wen Tong

For E-band wireless communications, a high gain steerable antenna with sub-arrays is desired to reduce the implementation complexity.

S$^2$-MLP: Spatial-Shift MLP Architecture for Vision

1 code implementation • 14 Jun 2021 • Tan Yu, Xu Li, Yunfeng Cai, Mingming Sun, Ping Li

We discover that the token-mixing MLP is a variant of the depthwise convolution with a global reception field and spatial-specific configuration.

Dynamic RAN Slicing for Service-Oriented Vehicular Networks via Constrained Learning

no code implementations • 3 Dec 2020 • Wen Wu, Nan Chen, Conghao Zhou, Mushu Li, Xuemin Shen, Weihua Zhuang, Xu Li

To obtain an optimal RAN slicing policy for accommodating the spatial-temporal dynamics of vehicle traffic density, we first formulate a constrained RAN slicing problem with the objective to minimize long-term system cost.

Reinforcement Learning (RL)

Two Types of Mixed Orthogonal Frequency Division Multiplexing (X-OFDM) Waveform for Optical Wireless Communication

no code implementations • 7 Nov 2020 • Xu Li, Jingjing Huang, Yibo Lyu, Rui Ni, Jiajin Luo, Junping Zhang

For the even sub-carriers in the frequency domain, the signal in the time domain after the IFFT is symmetric.

Replay and Synthetic Speech Detection with Res2net Architecture

2 code implementations • 28 Oct 2020 • Xu Li, Na Li, Chao Weng, Xunying Liu, Dan Su, Dong Yu, Helen Meng

This multiple scaling mechanism significantly improves the countermeasure's generalizability to unseen spoofing attacks.

Feature Engineering Synthetic Speech Detection

Scalable Adversarial Attack on Graph Neural Networks with Alternating Direction Method of Multipliers

no code implementations • 22 Sep 2020 • Boyuan Feng, Yuke Wang, Xu Li, Yufei Ding

Graph neural networks (GNNs) have achieved high performance in analyzing graph-structured data and have been widely deployed in safety-critical areas, such as finance and autonomous driving.

Adversarial Attack Autonomous Driving

Learned Proximal Networks for Quantitative Susceptibility Mapping

1 code implementation • 11 Aug 2020 • Kuo-Wei Lai, Manisha Aggarwal, Peter van Zijl, Xu Li, Jeremias Sulam

More importantly, this framework is believed to be the first deep learning QSM approach that can naturally handle an arbitrary number of phase input measurements without the need for any ad-hoc rotation or re-training.

Image Reconstruction

SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization

no code implementations • 9 Jul 2020 • Boyuan Feng, Yuke Wang, Xu Li, Shu Yang, Xueqiao Peng, Yufei Ding

With the increasing popularity of graph-based learning, Graph Neural Networks (GNNs) have attracted considerable attention from research and industry because of their high accuracy.

Quantization

Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification

no code implementations • 11 Jun 2020 • Xu Li, Na Li, Jinghua Zhong, Xixin Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng

Orthogonal to prior approaches, this work proposes to defend ASV systems against adversarial attacks with a separate detection network, rather than augmenting adversarial data into ASV training.

Binary Classification Data Augmentation +1

Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification

no code implementations • 8 Apr 2020 • Xu Li, Jinghua Zhong, Jianwei Yu, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng

Our experiment results indicate that the DNN x-vector system could benefit from BNNs especially when the mismatch problem is severe for evaluations using out-of-domain data.

Speaker Verification

Meta-CoTGAN: A Meta Cooperative Training Paradigm for Improving Adversarial Text Generation

no code implementations • 12 Mar 2020 • Haiyan Yin, Dingcheng Li, Xu Li, Ping Li

To this end, we introduce a cooperative training paradigm, where a language model is cooperatively trained with the generator and we utilize the language model to efficiently shape the data distribution of the generator against mode collapse.

Adversarial Text Language Modelling +2

Deep segmental phonetic posterior-grams based discovery of non-categories in L2 English speech

no code implementations • 1 Feb 2020 • Xu Li, Xixin Wu, Xunying Liu, Helen Meng

We then explore the non-categories by looking for SPPGs with more than one peak.

FFusionCGAN: An end-to-end fusion method for few-focus images using conditional GAN in cytopathological digital slides

1 code implementation • 3 Jan 2020 • Xiebo Geng, Sibo Liua, Wei Han, Xu Li, Jiabo Ma, Jingya Yu, Xiuli Liu, Sahoqun Zeng, Li Chen, Shenghua Cheng

However, although existing image fusion techniques, including traditional algorithms and deep learning-based algorithms, can generate high-quality fused images, they need multiple images with different focus depths in the same field of view.

Generative Adversarial Network Semantic Segmentation +1

Adversarial Attacks on GMM i-vector based Speaker Verification Systems

2 code implementations • 8 Nov 2019 • Xu Li, Jinghua Zhong, Xixin Wu, Jianwei Yu, Xunying Liu, Helen Meng

Experimental results show that GMM i-vector systems are seriously vulnerable to adversarial attacks, and the crafted adversarial samples prove to be transferable and pose threats to neural network speaker embedding based systems (e.g. x-vector systems).

Speaker Verification

Word embedding re-examined: is the symmetrical factorization optimal?

no code implementations • 25 Sep 2019 • Zhichao Han, Jia Li, Xu Li, Hong Cheng

Such linear transformation will result in these good properties.

Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction

no code implementations • 29 Apr 2019 • Mingming Sun, Xu Li, Xin Wang, Miao Fan, Yue Feng, Ping Li

In this paper, we consider the problem of open information extraction (OIE) for extracting entity-level and relation-level intermediate structures from sentences in the open domain.

Attribute Open Information Extraction +3

Segmentation of Levator Hiatus Using Multi-Scale Local Region Active contours and Boundary Shape Similarity Constraint

no code implementations • 11 Jan 2019 • Xinling Zhang, Xu Li, Ying Chen, Yixin Gan, Dexing Kong, Rongqin Zheng

In this paper, a multi-scale framework with local region based active contour and boundary shape similarity constraint is proposed for the segmentation of levator hiatus in ultrasound images.

Segmentation

Global optimization of expensive black-box models based on asynchronous hybrid-criterion with interval reduction

no code implementations • 29 Nov 2018 • Chunlin Gong, Xu Li, Hua Su, Jinlei Guo, Liangxian Gu

Third, to accelerate the local search, a local optimum is sought with sequential quadratic programming (SQP) based on the local surrogate models in the reduced interval, which involves some samples near the current optimum.

Logician and Orator: Learning from the Duality between Language and Knowledge in Open Domain

no code implementations • EMNLP 2018 • Mingming Sun, Xu Li, Ping Li

We propose the task of Open-Domain Information Narration (OIN) as the reverse task of Open Information Extraction (OIE), to implement the dual structure between language and knowledge in the open domain.

Open Information Extraction reinforcement-learning +2

Noise Robust IOA/CAS Speech Separation and Recognition System For The Third 'CHIME' Challenge

no code implementations • 21 Sep 2015 • Xiaofei Wang, Chao Wu, Pengyuan Zhang, Ziteng Wang, Yong Liu, Xu Li, Qiang Fu, Yonghong Yan

This paper presents the contribution to the third 'CHiME' speech separation and recognition challenge including both front-end signal processing and back-end speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
