Scalable Hybrid HMM with Gaussian Process Emission for Sequential Time-series Data Clustering

7 Jan 2020  ·  Yohan Jung, Jinkyoo Park

A Hidden Markov Model (HMM) combined with Gaussian Process (GP) emissions can be used effectively to estimate the hidden state from a sequence of observations with complex input-output relations. When the spectral mixture (SM) kernel is used for the GP emission, we call the model a hybrid HMM-GPSM. This model can effectively model sequences of time-series data. However, because the SM kernel has a large number of parameters, the model cannot be trained effectively on a large volume of data having (1) a long sequence for state transition and (2) a large amount of time-series data in each sequence. This paper proposes a scalable learning method for HMM-GPSM. To train the model effectively on a long sequence, the proposed method employs a Stochastic Variational Inference (SVI) approach. Also, to process the large number of data points in each time series efficiently, we approximate the SM kernel using Reparametrized Random Fourier Features (R-RFF). The combination of these two techniques significantly reduces the training time. We validate the proposed learning method in terms of its hidden-state estimation accuracy and computation time using large-scale synthetic and real data sets with missing values.
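The kernel approximation mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the standard one-dimensional SM kernel, k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi mu_q tau), and it assumes that "reparametrized" means drawing each spectral frequency as s = mu_q + sqrt(v_q) * eps with eps ~ N(0, 1), so that the samples stay a deterministic function of the kernel parameters. The function names `sm_kernel` and `rrff_features`, and all parameter values, are hypothetical.

```python
import numpy as np

def sm_kernel(x1, x2, weights, means, variances):
    """Exact 1-D spectral mixture kernel matrix (assumed standard form)."""
    tau = x1[:, None] - x2[None, :]
    k = np.zeros_like(tau)
    for w, mu, v in zip(weights, means, variances):
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * mu * tau)
    return k

def rrff_features(x, weights, means, variances, eps):
    """Random Fourier features for the SM kernel.

    eps has shape (Q, M): fixed standard-normal draws, one row per mixture
    component. Frequencies are built as mu + sqrt(v) * eps (an assumption
    about what the reparametrization looks like).
    """
    m = eps.shape[1]
    feats = []
    for q, (w, mu, v) in enumerate(zip(weights, means, variances)):
        s = mu + np.sqrt(v) * eps[q]              # reparametrized frequencies
        phase = 2.0 * np.pi * x[:, None] * s[None, :]
        scale = np.sqrt(w / m)
        feats.append(scale * np.cos(phase))
        feats.append(scale * np.sin(phase))
    return np.concatenate(feats, axis=1)          # shape (N, 2 * Q * M)

# Hypothetical two-component kernel and inputs.
rng = np.random.default_rng(0)
weights = np.array([1.0, 0.5])
means = np.array([0.5, 2.0])
variances = np.array([0.01, 0.05])
x = np.linspace(0.0, 5.0, 50)

eps = rng.standard_normal((2, 2000))
phi = rrff_features(x, weights, means, variances, eps)

k_exact = sm_kernel(x, x, weights, means, variances)
k_approx = phi @ phi.T                            # low-rank approximation
max_err = np.max(np.abs(k_exact - k_approx))
```

With N data points and D = 2QM features, working with `phi` instead of the full N x N kernel matrix replaces the usual O(N^3) GP cost with roughly O(N D^2), which is the kind of saving the abstract refers to for time series with many data points.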
