Search Results for author: Patrick Cardinal

Found 30 papers, 7 papers with code

Recursive Joint Attention for Audio-Visual Fusion in Regression based Emotion Recognition

1 code implementation • 17 Apr 2023 • R Gnana Praveen, Eric Granger, Patrick Cardinal

In video-based emotion recognition (ER), it is important to effectively leverage the complementary relationship among audio (A) and visual (V) modalities, while retaining the intra-modal characteristics of individual modalities.

Emotion Recognition regression

Audio-Visual Fusion for Emotion Recognition in the Valence-Arousal Space Using Joint Cross-Attention

1 code implementation • 19 Sep 2022 • R Gnana Praveen, Eric Granger, Patrick Cardinal

In this paper, we focus on dimensional ER based on the fusion of facial and vocal modalities extracted from videos, where complementary audio-visual (A-V) relationships are explored to predict an individual's emotional states in valence-arousal space.

Emotion Recognition

RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks

no code implementations • 14 Jul 2022 • Mohammad Esmaeilpour, Nourhene Chaalia, Patrick Cardinal

This paper introduces a new synthesis-based defense algorithm for counteracting a variety of adversarial attacks developed to challenge the performance of cutting-edge speech-to-text transcription systems.

From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks

no code implementations • 14 Apr 2022 • Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network, namely ResNet-18.

Adversarial Attack Adversarial Robustness +1

A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition

1 code implementation • 28 Mar 2022 • Gnana Praveen Rajasekar, Wheidima Carneiro de Melo, Nasib Ullah, Haseeb Aslam, Osama Zeeshan, Théo Denorme, Marco Pedersoli, Alessandro Koerich, Simon Bacon, Patrick Cardinal, Eric Granger

Specifically, we propose a joint cross-attention model that relies on the complementary relationships to extract the salient features across A-V modalities, allowing for accurate prediction of continuous values of valence and arousal.

Multimodal Emotion Recognition

Bi-Discriminator Class-Conditional Tabular GAN

no code implementations • 12 Nov 2021 • Mohammad Esmaeilpour, Nourhene Chaalia, Adel Abusitta, Francois-Xavier Devailly, Wissem Maazoun, Patrick Cardinal

This paper introduces a bi-discriminator GAN for synthesizing tabular datasets containing continuous, binary, and discrete columns.

Benchmarking

Cross Attentional Audio-Visual Fusion for Dimensional Emotion Recognition

1 code implementation • 9 Nov 2021 • Gnana Praveen R, Eric Granger, Patrick Cardinal

Results indicate that our cross-attentional A-V fusion model is a cost-effective approach that outperforms state-of-the-art fusion approaches.

Multimodal Emotion Recognition

Towards Robust Speech-to-Text Adversarial Attack

no code implementations • 15 Mar 2021 • Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

This paper introduces a novel adversarial algorithm for attacking state-of-the-art speech-to-text systems, namely DeepSpeech, Kaldi, and Lingvo.

Adversarial Attack Room Impulse Response (RIR) +1

Multi-Discriminator Sobolev Defense-GAN Against Adversarial Attacks for End-to-End Speech Systems

no code implementations • 15 Mar 2021 • Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

This paper introduces a defense approach against end-to-end adversarial attacks developed for cutting-edge speech-to-text systems.

Weakly Supervised Learning for Facial Behavior Analysis: A Review

no code implementations • 25 Jan 2021 • Gnana Praveen R, Eric Granger, Patrick Cardinal

In this paper, we provide a comprehensive review of weakly supervised learning (WSL) approaches for facial behavior analysis with both categorical and dimensional labels, along with the associated challenges and potential research directions.

Weakly-supervised Learning

Deep DA for Ordinal Regression of Pain Intensity Estimation Using Weakly-Labeled Videos

no code implementations • 28 Oct 2020 • Gnana Praveen R, Eric Granger, Patrick Cardinal

The WSDA-OR model enforces ordinal relationships among the intensity levels assigned to the target sequences, and associates multiple relevant frames to sequence-level labels (instead of a single frame).

Domain Adaptation regression +1

Class-Conditional Defense GAN Against End-to-End Speech Attacks

no code implementations • 22 Oct 2020 • Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

In this paper we propose a novel defense approach against end-to-end adversarial attacks developed to fool advanced speech-to-text systems such as DeepSpeech and Lingvo.

Generative Adversarial Network Sentence

Conditioning Trick for Training Stable GANs

no code implementations • 12 Oct 2020 • Mohammad Esmaeilpour, Raymel Alfonso Sallo, Olivier St-Georges, Patrick Cardinal, Alessandro Lameiras Koerich

In this paper we propose a conditioning trick, called difference departure from normality, applied on the generator network in response to instability issues during GAN training.

Adversarially Training for Audio Classifiers

no code implementations • 26 Aug 2020 • Raymel Alfonso Sallo, Mohammad Esmaeilpour, Patrick Cardinal

In this paper, we investigate the potential effect of adversarial training on the robustness of six advanced deep neural networks against a variety of targeted and non-targeted adversarial attacks.

Benchmarking

Improving Stability of LS-GANs for Audio and Speech Signals

no code implementations • 12 Aug 2020 • Mohammad Esmaeilpour, Raymel Alfonso Sallo, Olivier St-Georges, Patrick Cardinal, Alessandro Lameiras Koerich

In this paper, we address the instability issue of generative adversarial networks (GANs) by proposing a new similarity metric in the unitary space of the Schur decomposition for 2D representations of audio and speech signals.

Generative Adversarial Network

From Sound Representation to Model Robustness

no code implementations • 27 Jul 2020 • Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

In this paper, we investigate the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network.

Adversarial Attack Adversarial Robustness +1

Deep Weakly-Supervised Domain Adaptation for Pain Localization in Videos

no code implementations • 17 Oct 2019 • Gnana Praveen R, Eric Granger, Patrick Cardinal

Automatic pain assessment has an important potential diagnostic value for populations that are incapable of articulating their pain experiences.

Domain Adaptation Multiple Instance Learning

Emotion Recognition with Spatial Attention and Temporal Softmax Pooling

no code implementations • 2 Oct 2019 • Masih Aminbeidokhti, Marco Pedersoli, Patrick Cardinal, Eric Granger

Video-based emotion recognition is a challenging task because it requires distinguishing the small deformations of the human face that represent emotions, while remaining invariant to stronger visual differences due to different identities.

Emotion Recognition

Universal Adversarial Audio Perturbations

1 code implementation • arXiv preprint 2019 • Sajjad Abdoli, Luiz G. Hafemann, Jerome Rony, Ismail Ben Ayed, Patrick Cardinal, Alessandro L. Koerich

We demonstrate the existence of universal adversarial perturbations, which can fool a family of audio classification architectures, for both targeted and untargeted attack scenarios.

Audio Classification

Bag-of-Audio-Words based on Autoencoder Codebook for Continuous Emotion Prediction

no code implementations • 6 Jul 2019 • Mohammed Senoussaoui, Patrick Cardinal, Alessandro Lameiras Koerich

The conventional BoW model is based on a dictionary (codebook) built from elementary representations which are selected randomly or by using a clustering algorithm on a training dataset.

Clustering

Emotion Recognition Using Fusion of Audio and Video Features

no code implementations • 25 Jun 2019 • Juan D. S. Ortega, Patrick Cardinal, Alessandro L. Koerich

In this paper we propose a fusion approach to continuous emotion recognition that combines visual and auditory modalities in their representation spaces to predict the arousal and valence levels.

Emotion Recognition Transfer Learning

Speaker Sincerity Detection based on Covariance Feature Vectors and Ensemble Methods

no code implementations • 26 Apr 2019 • Mohammed Senoussaoui, Patrick Cardinal, Najim Dehak, Alessandro Lameiras Koerich

Automatically measuring the degree of speaker sincerity is a novel research problem in computational paralinguistics.

A Robust Approach for Securing Audio Classification Against Adversarial Attacks

no code implementations • 24 Apr 2019 • Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

In this paper, we first review some strong adversarial attacks that may affect both audio signals and their 2D representations. We then evaluate the resiliency of the most common machine learning models, namely deep learning models and support vector machines (SVM) trained on 2D audio representations such as the short-time Fourier transform (STFT), discrete wavelet transform (DWT), and cross recurrent plot (CRP), against several state-of-the-art adversarial attacks.

Audio Classification BIG-bench Machine Learning +2

End-to-End Environmental Sound Classification using a 1D Convolutional Neural Network

3 code implementations • 18 Apr 2019 • Sajjad Abdoli, Patrick Cardinal, Alessandro Lameiras Koerich

In this paper, we present an end-to-end approach for environmental sound classification based on a 1D Convolution Neural Network (CNN) that learns a representation directly from the audio signal.

Ranked #6 on Environmental Sound Classification on UrbanSound8K (Accuracy metric, using extra training data)

Environmental Sound Classification General Classification +1

Unsupervised Feature Learning for Environmental Sound Classification Using Weighted Cycle-Consistent Generative Adversarial Network

no code implementations • 8 Apr 2019 • Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

In this paper, we propose a novel environmental sound classification approach incorporating unsupervised feature learning from a codebook via the spherical $K$-Means++ algorithm and a new architecture for high-level data augmentation.

Benchmarking Classification +5
