no code implementations • 19 Apr 2024 • Zhaoxi Mu, Xinyu Yang
In audio-visual target speech extraction tasks, the audio modality tends to dominate, potentially overshadowing the importance of visual guidance.
1 code implementation • 18 Apr 2024 • Hanshi Sun, Zhuoming Chen, Xinyu Yang, Yuandong Tian, Beidi Chen
However, key-value (KV) cache, which is stored to avoid re-computation, has emerged as a critical bottleneck by growing linearly in size with the sequence length.
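As a rough illustration of the linear growth noted above, the sketch below estimates KV cache size assuming one key tensor and one value tensor per layer. The function name and the default model dimensions (32 layers, 32 KV heads, head dimension 128, 16-bit values) are illustrative assumptions, not figures taken from the paper.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128,
                   bytes_per_value=2, batch_size=1):
    """Estimate KV cache size in bytes for a decoder-only transformer.

    All default dimensions are hypothetical and chosen only for illustration.
    """
    # Factor 2: each layer stores one key tensor and one value tensor.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * seq_len * batch_size


if __name__ == "__main__":
    for seq_len in (4096, 32768, 131072):
        gib = kv_cache_bytes(seq_len) / 2**30
        print(f"{seq_len} tokens: {gib} GiB")
```

Under these assumed dimensions the cache costs 0.5 MiB per token, so doubling the sequence length doubles the memory footprint.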
1 code implementation • 27 Feb 2024 • Xinyu Yang, Hossein Rahmani, Sue Black, Bryan M. Williams
Class activation maps (CAMs) are commonly employed in weakly supervised semantic segmentation (WSSS) to produce pseudo-labels.
Weakly-Supervised Semantic Segmentation
1 code implementation • 14 Feb 2024 • Harry Dong, Xinyu Yang, Zhenyu Zhang, Zhangyang Wang, Yuejie Chi, Beidi Chen
Many computational factors limit broader deployment of large language models.
1 code implementation • 7 Feb 2024 • Weixin Liang, Nazneen Rajani, Xinyu Yang, Ezinwanne Ozoani, Eric Wu, Yiqun Chen, Daniel Scott Smith, James Zou
To evaluate the impact of model cards, we conducted an intervention study by adding detailed model cards to 42 popular models which had no or sparse model cards previously.
no code implementations • 24 Jan 2024 • Xinyu Yang, Jizhe Zhou
In recent years, particularly since the early 2020s, Large Language Models (LLMs) have emerged as the most powerful AI tools in addressing a diverse range of challenges, from natural language processing to complex problem-solving in various domains.
no code implementations • 24 Jan 2024 • Otto Brookes, Majid Mirmehdi, Colleen Stephens, Samuel Angedakin, Katherine Corogenes, Dervla Dowd, Paula Dieguez, Thurston C. Hicks, Sorrel Jones, Kevin Lee, Vera Leinert, Juan Lapuente, Maureen S. McCarthy, Amelia Meier, Mizuki Murai, Emmanuelle Normand, Virginie Vergnes, Erin G. Wessling, Roman M. Wittig, Kevin Langergraber, Nuria Maldonado, Xinyu Yang, Klaus Zuberbuhler, Christophe Boesch, Mimi Arandjelovic, Hjalmar Kuhl, Tilo Burghardt
We present the PanAf20K dataset, the largest and most diverse open-access annotated video dataset of great apes in their natural environment.
1 code implementation • 24 Jan 2024 • Xinyu Yang, Weixin Liang, James Zou
By analyzing all 7,433 dataset documentation on Hugging Face, our investigation provides an overview of the Hugging Face dataset ecosystem and insights into dataset documentation practices, yielding 5 main findings: (1) The dataset card completion rate shows marked heterogeneity correlated with dataset popularity.
1 code implementation • 16 Jan 2024 • Hang Chen, Xinyu Yang, Keqing Du
These highlights make the probabilistic model capable of overcoming challenges brought by the coexistence of multi-structure data and multi-value representations and pave the way for the extension of latent confounders.
no code implementations • 17 Dec 2023 • Xinyu Yang, Hongbo Bo
Face swapping has gained significant traction, driven by the plethora of human face synthesis facilitated by deep learning methods.
1 code implementation • 16 Dec 2023 • Shulei Ji, Xinyu Yang
However, prior research on deep learning-based emotional music generation has rarely explored the contribution of different musical elements to emotions, let alone the deliberate manipulation of these elements to alter the emotion of music, which is not conducive to fine-grained element-level control over emotions.
no code implementations • 16 Dec 2023 • Zhaoxi Mu, Xinyu Yang, Sining Sun, Qing Yang
However, in the task of target speech extraction, certain elements of global and local semantic information in the reference speech, which are irrelevant to speaker identity, can lead to speaker confusion within the speech extraction network.
no code implementations • 21 Nov 2023 • Keqing Du, Xinyu Yang, Hang Chen
CASR works by reducing the difference between the causal adjacency matrix we constructed and the pre-segmentation results of backbone models.
1 code implementation • 6 Nov 2023 • Chenhang Cui, Yiyang Zhou, Xinyu Yang, Shirley Wu, Linjun Zhang, James Zou, Huaxiu Yao
To bridge this gap, we introduce a new benchmark, namely, the Bias and Interference Challenges in Visual Language Models (Bingo).
no code implementations • 2 Nov 2023 • Hang Chen, Keqing Du, Chenguang Li, Xinyu Yang
The fusion of causal models with deep learning, which introduces increasingly intricate data sets such as the causal associations within images or between textual components, has surfaced as a focal research area.
no code implementations • 28 Oct 2023 • Hang Chen, Xinyu Yang, Keqing Du
The cross-pollination of deep learning and causal discovery has catalyzed a burgeoning field of research seeking to elucidate causal relationships within non-statistical data forms like images, videos, and text.
no code implementations • 23 Oct 2023 • Xinyu Yang, Jizhe Zhou
A modified total variation noise reduction technique is then used to process the subtracted image.
1 code implementation • 3 Oct 2023 • Weixin Liang, Yuhui Zhang, Hancheng Cao, Binglu Wang, Daisy Ding, Xinyu Yang, Kailas Vodrahalli, Siyu He, Daniel Smith, Yian Yin, Daniel McFarland, James Zou
We first quantitatively compared GPT-4's generated feedback with human peer reviewer feedback in 15 Nature family journals (3,096 papers in total) and the ICLR machine learning conference (1,709 papers).
1 code implementation • 25 Sep 2023 • Zekun Cai, Renhe Jiang, Xinyu Yang, Zhaonan Wang, Diansheng Guo, Hiroki Kobayashi, Xuan Song, Ryosuke Shibasaki
Urban time series forecasting, which contributes significantly to sustainable development, is widely studied as an essential task for smart cities.
Ranked #1 on Traffic Prediction on Beijing Traffic
no code implementations • 6 Jun 2023 • Shulei Ji, Xinyu Yang
To solve these problems, we propose a novel LSTM-based Hierarchical Variational Auto-Encoder (LHVAE) to investigate the influence of emotional conditions on melody harmonization, while improving the quality of generated harmonies and capturing the abundant variability of chord progressions.
1 code implementation • 28 May 2023 • Hang Chen, Bingyu Liao, Jing Luo, Wenjing Zhu, Xinyu Yang
Reasoning, a crucial aspect of NLP research, has not been adequately addressed by prevailing models, including Large Language Models.
1 code implementation • 4 May 2023 • Weixin Liang, Yining Mao, Yongchan Kwon, Xinyu Yang, James Zou
Our work highlights the importance of understanding the nonlinear effects of model improvement on performance in different subpopulations, and has the potential to inform the development of more equitable and responsible machine learning models.
1 code implementation • 4 May 2023 • Hang Chen, Jing Luo, Xinyu Yang, Wenjing Zhu
noise terms into the conversation process, thereby constructing a structural causal model (SCM).
no code implementations • 4 May 2023 • Hang Chen, Xinyu Yang, Qing Yang
We implement the above designs as a dynamic variational inference model, tailored to learn causal representation from indefinite data under latent confounding.
2 code implementations • 27 Mar 2023 • Xiangyuan Yang, Jie Lin, HANLIN ZHANG, Xinyu Yang, Peng Zhao
Although considerable effort has been devoted to improving the transferability of adversarial examples generated by transfer-based adversarial attacks, our investigation found that the large update step length in current transfer-based attacks causes a big deviation between the actual and steepest update directions, so the generated adversarial examples cannot converge well.
no code implementations • 17 Mar 2023 • Xiangyuan Yang, Jie Lin, HANLIN ZHANG, Xinyu Yang, Peng Zhao
In this paper, we first systematically investigated this issue and found that the enormous difference in attack success rates between the surrogate model and the victim model is caused by the existence of a special area (known as the fuzzy domain in our paper), in which adversarial examples are classified wrongly by the surrogate model but correctly by the victim model.
no code implementations • 7 Mar 2023 • Zhaoxi Mu, Xinyu Yang, Wenjing Zhu
Specifically, we design a new network SE-Conformer that can model audio sequences in multiple dimensions and scales, and apply it to the dual-path speech separation framework.
no code implementations • 7 Mar 2023 • Zhaoxi Mu, Xinyu Yang, Xiangyuan Yang, Wenjing Zhu
In noisy and reverberant environments, the performance of deep learning-based speech separation methods drops dramatically because previous methods are not designed and optimized for such situations.
2 code implementations • 22 Feb 2023 • Chengxi Zeng, Xinyu Yang, David Smithard, Majid Mirmehdi, Alberto M Gambaruto, Tilo Burghardt
This paper presents a deep learning framework for medical video segmentation.
no code implementations • 6 Feb 2023 • Huaxiu Yao, Xinyu Yang, Xinyi Pan, Shengchao Liu, Pang Wei Koh, Chelsea Finn
Distribution shift presents a significant challenge in machine learning, where models often underperform during the test stage when faced with a different distribution than the one they were trained on.
no code implementations • CVPR 2023 • Zhijian Liu, Xinyu Yang, Haotian Tang, Shang Yang, Song Han
Transformer, as an alternative to CNN, has been proven effective in many modalities (e.g., texts and images).
no code implementations • 10 Nov 2022 • Xinyu Yang, Haoyuan Liu, Ziyu Wang, Peng Gao
System auditing has emerged as a key approach for monitoring system call events and investigating sophisticated attacks.
no code implementations • 4 Nov 2022 • Jingchang Zhuge, Huiyuan Liang, Yiming Zhang, Shichao Li, Xinyu Yang, Jun Wu
Aircraft taxiing conflict is a threat to the safety of airport operations, mainly due to human error in control command information.
1 code implementation • 25 Oct 2022 • Xinyu Yang, Huaxiu Yao, Allan Zhou, Chelsea Finn
We study this multi-domain long-tailed learning problem and aim to produce a model that generalizes well across all classes and domains.
no code implementations • 14 Sep 2022 • Hang Chen, Keqing Du, Xinyu Yang, Chenguang Li
Understanding causality helps to structure interventions to achieve specific goals and enables predictions under interventions.
no code implementations • 29 Aug 2022 • Hang Chen, Xinyu Yang, Xiang Li
To learn it applicably, we propose a general clause-level encoding model named EA-GAT comprising E-GAT and Activation Sort.
2 code implementations • 17 Aug 2022 • Chengxi Zeng, Xinyu Yang, Majid Mirmehdi, Alberto M Gambaruto, Tilo Burghardt
Our findings suggest that the proposed model can indeed enhance the TransUNet architecture via exploiting temporal information and improving segmentation performance by a significant margin.
no code implementations • 7 Aug 2022 • Feixiang Zhou, Xinyu Yang, Fang Chen, Long Chen, Zheheng Jiang, Hui Zhu, Reiko Heckel, Haikuan Wang, Minrui Fei, Huiyu Zhou
Furthermore, we design a novel Interaction-Aware Transformer (IAT) to dynamically learn the graph-level representation of social behaviours and update the node-level representation, guided by our proposed interaction-aware self-attention mechanism.
no code implementations • 2 Jun 2022 • Xiangyuan Yang, Jie Lin, HANLIN ZHANG, Xinyu Yang, Peng Zhao
The empirical and theoretical analysis demonstrates that the MDL loss improves the robustness and generalization of the model simultaneously for natural training.
no code implementations • 2 Jun 2022 • Xiangyuan Yang, Jie Lin, HANLIN ZHANG, Xinyu Yang, Peng Zhao
To enhance the robustness of the classifier, in our paper, a Feature Analysis and Conditional Matching prediction distribution (FACM) model is proposed to utilize the features of intermediate layers to correct the classification.
1 code implementation • 26 May 2022 • Zhijian Liu, Haotian Tang, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela Rus, Song Han
Multi-sensor fusion is essential for an accurate and reliable autonomous driving system.
Ranked #4 on 3D Object Detection on nuScenes
no code implementations • 19 May 2022 • Xiangyuan Yang, Jie Lin, HANLIN ZHANG, Xinyu Yang, Peng Zhao
Specifically, we propose a gradient aligned mechanism to ensure that the derivatives of the loss function with respect to the logit vector have the same weight coefficients between the surrogate and victim models.
1 code implementation • 4 May 2022 • Adalberto Claudio Quiros, Nicolas Coudray, Anna Yeaton, Xinyu Yang, Bojing Liu, Hortense Le, Luis Chiriboga, Afreen Karimkhan, Navneet Narula, David A. Moore, Christopher Y. Park, Harvey Pass, Andre L. Moreira, John Le Quesne, Aristotelis Tsirigos, Ke Yuan
Definitive cancer diagnosis and management depend upon the extraction of information from microscopy images by pathologists.
1 code implementation • 30 Apr 2022 • Xinyu Yang, Tilo Burghardt, Majid Mirmehdi
We propose a novel end-to-end curriculum learning approach for sparsely labelled animal datasets leveraging large volumes of unlabelled data to improve supervised species detectors.
no code implementations • 26 Apr 2022 • Rufan Bai, Haoxing Lin, Xinyu Yang, Xiaowei Wu, Minming Li, Weijia Jia
In this work, we initiate the study of mixed strategies for the security games in which the targets can have different defending requirements.
1 code implementation • 24 Mar 2022 • Luhui Wang, Cong Zhao, Shusen Yang, Xinyu Yang, Julie McCann
Intelligent applications based on machine learning are impacting many parts of our lives.
no code implementations • 23 Aug 2021 • Xinyu Yang, Xinlan Zhang, Zhenguo Zhang, Yahui Zhao, Rongyi Cui
In order to reasonably measure the distance between time series, dynamic time warping (DTW), which has been verified to be an effective method for time series, is employed as the distance metric.
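For readers unfamiliar with DTW as a distance metric, the following is a minimal sketch of the textbook dynamic-programming formulation. It is not the authors' implementation; the function name `dtw_distance` and the choice of absolute difference as the local cost are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Uses absolute difference as the local cost (an illustrative choice).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

Because the alignment may repeat points, two series that differ only in local timing (for example `[1, 2, 3]` and `[1, 2, 2, 3]`) are at distance zero, which a pointwise Euclidean comparison cannot express for series of different lengths.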
no code implementations • 20 Apr 2021 • Zhaoxi Mu, Xinyu Yang, Yizhuo Dong
As an indispensable part of modern human-computer interaction systems, speech synthesis technology helps users obtain the output of intelligent machines more easily and intuitively, and has thus attracted more and more attention.
no code implementations • 1 Apr 2021 • Yuka Takeishi, Mingxuan Niu, Jing Luo, Zhong Jin, Xinyu Yang
To further explore the creative potential of natural language generation systems in Japanese poetry creation, we propose a novel Waka generation model, WakaVT, which automatically produces Waka poems given user-specified keywords.
no code implementations • 13 Nov 2020 • Shulei Ji, Jing Luo, Xinyu Yang
This paper attempts to provide an overview of various composition tasks under different music generation levels, covering most of the currently popular music generation tasks using deep learning.
1 code implementation • 14 Oct 2020 • Xinyu Yang, Majid Mirmehdi, Tilo Burghardt
In this paper we show that learning video feature spaces in which temporal cycles are maximally predictable benefits action classification.
no code implementations • 9 Oct 2020 • Fangyuan Zhao, Xuebin Ren, Shusen Yang, Qing Han, Peng Zhao, Xinyu Yang
To address the privacy issue in LDA, we systematically investigate the privacy protection of the main-stream LDA training algorithm based on Collapsed Gibbs Sampling (CGS) and propose several differentially private LDA algorithms for typical training scenarios.
1 code implementation • 16 Jun 2020 • Haoxing Lin, Rufan Bai, Weijia Jia, Xinyu Yang, Yongjian You
To filter out irrelevant noises and alleviate the error propagation, DSAN dynamically extracts valuable information by applying self-attention over the noisy input and bridges each output directly to the purified inputs via implementing a switch-attention mechanism.
no code implementations • 22 Apr 2020 • Qing Han, Shusen Yang, Xuebin Ren, Cong Zhao, Jingqi Zhang, Xinyu Yang
However, heterogeneous and limited computation and communication resources on edge servers (or edges) pose great challenges for distributed ML and give rise to a new paradigm of Edge Learning (i.e., edge-cloud collaborative machine learning).
no code implementations • 27 Nov 2019 • Jun Zhao, Teng Wang, Tao Bai, Kwok-Yan Lam, Zhiying Xu, Shuyu Shi, Xuebin Ren, Xinyu Yang, Yang Liu, Han Yu
Although both classical Gaussian mechanisms [1, 2] assume $0 < \epsilon \leq 1$, our review finds that many studies in the literature have used the classical Gaussian mechanisms under values of $\epsilon$ and $\delta$ where the added noise amounts of [1, 2] do not achieve $(\epsilon,\delta)$-DP.
no code implementations • 29 Sep 2019 • Jing Luo, Xinyu Yang, Shulei Ji, Juan Li
In this paper, we propose MG-VAE, a music generative model based on VAE (Variational Auto-Encoder) that is capable of capturing specific music style and generating novel tunes for Chinese folk songs (Min Ge) in a manipulatable way.
Music Generation • Multimedia • Sound • Audio and Speech Processing
no code implementations • 29 Aug 2019 • Andrey Kormilitzin, Xinyu Yang, William H. Stone, Caroline Woffindale, Francesca Nicholls, Elena Ribe, Alejo Nevado-Holgado, Noel Buckley
Understanding the morphological changes of primary neuronal cells induced by chemical compounds is essential for drug discovery.
no code implementations • 29 Aug 2019 • Xinyu Yang, Majid Mirmehdi, Tilo Burghardt
We propose the first multi-frame video object detection framework trained to detect great apes.
no code implementations • 5 Jun 2019 • Yanan Li, Xuebin Ren, Shusen Yang, Xinyu Yang
Considering general correlations, a closed-form expression of privacy leakage is derived for continuous data, and a chain rule is presented for discrete data.
no code implementations • 4 Jun 2019 • Teng Wang, Jun Zhao, Han Yu, Jinyan Liu, Xinyu Yang, Xuebin Ren, Shuyu Shi
To investigate such ethical dilemmas, recent studies have adopted preference aggregation, in which each voter expresses her/his preferences over decisions for the possible ethical dilemma scenarios, and a centralized system aggregates these preferences to obtain the winning decision.
no code implementations • 4 Jun 2019 • Fangyuan Zhao, Xuebin Ren, Shusen Yang, Xinyu Yang
Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for discovering the hidden semantic architecture of text datasets, and plays a fundamental role in many machine learning applications.
1 code implementation • PLOS ONE 2017 • Xinyu Yang, Guoai Xu, Qi Li, Yanhui Guo, Miao Zhang
These metrics are then input to a neural network for supervised learning, the weights of which are produced by a hybrid PSO and BP algorithm.