no code implementations • 1 Oct 2023 • Spandan Dey, Premjeet Singh, Goutam Saha
We show that LID requires low octave resolution and that frequency scattering is not useful.
no code implementations • 10 Feb 2023 • Spandan Dey, Md Sahidullah, Goutam Saha
Our experiments demonstrate that the proposed domain diversification is more promising than commonly used simple augmentation methods.
no code implementations • 14 Jan 2023 • Premjeet Singh, Md Sahidullah, Goutam Saha
This work explores the use of constant-Q transform based modulation spectral features (CQT-MSF) for speech emotion recognition (SER).
no code implementations • 30 Nov 2022 • Spandan Dey, Md Sahidullah, Goutam Saha
In this work, we present one of the first comprehensive reviews of the Indian spoken language recognition research field.
no code implementations • 29 Nov 2022 • Premjeet Singh, Shefali Waldekar, Md Sahidullah, Goutam Saha
This work analyzes the constant-Q filterbank-based time-frequency representations for speech emotion recognition (SER).
no code implementations • 11 May 2021 • Premjeet Singh, Goutam Saha, Md Sahidullah
We also investigate layer-wise scattering coefficients to analyse the importance of time-shift- and deformation-stable scalogram and modulation spectrum coefficients for SER.
no code implementations • 10 May 2021 • Spandan Dey, Goutam Saha, Md Sahidullah
In this paper, we conduct one of the very first studies of cross-corpora performance evaluation in the spoken language identification (LID) problem.
no code implementations • 8 Feb 2021 • Premjeet Singh, Goutam Saha, Md Sahidullah
In this work, we explore the constant-Q transform (CQT) for speech emotion recognition (SER).
no code implementations • 25 Jan 2021 • A Kishore Kumar, Shefali Waldekar, Goutam Saha, Md Sahidullah
This report presents the system developed by the ABSP Laboratory team for the third DIHARD speech diarization challenge.
no code implementations • 21 Jul 2020 • Susanta Sarangi, Md Sahidullah, Goutam Saha
Then, we propose a new method for computing the filter frequency responses by using principal component analysis (PCA).
no code implementations • 29 Jan 2019 • Arnab Poddar, Md Sahidullah, Goutam Saha
We have used the proposed quality measures as side information for combining ASV systems based on Gaussian mixture model-universal background model (GMM-UBM) and i-vector.
no code implementations • 3 Dec 2018 • Arnab Poddar, Md Sahidullah, Goutam Saha
In experiments with the NIST SRE 2008 corpus, we have shown that including the proposed quality metric yields a considerable improvement in speaker verification performance.
no code implementations • 22 Dec 2016 • Monisankha Pal, Dipjyoti Paul, Md Sahidullah, Goutam Saha
Most existing studies on voice conversion (VC) are conducted in acoustically matched conditions between the source and target signals.
no code implementations • 21 Aug 2015 • Sudip Mandal, Goutam Saha, Rajat K. Pal
In the next phase of this research, the BA-based RNN is applied to a real-world benchmark time-series microarray dataset of E. coli.