
I am using a mixture hidden Markov model (MHMM) to cluster my data, with the R package "seqHMM". My question is whether it is possible to obtain the actual observations within each cluster.

For example, after my analysis I have 3 clusters, and I want to find the exact observations that belong to each cluster. Is that possible?

Example:

First, I created three HMMs with initial probabilities sc_init1, sc_init2, sc_init3, transition probabilities sc_trans1, sc_trans2, sc_trans3, and emission probabilities sc_emiss1, sc_emiss2, sc_emiss3, respectively. Then I combined them into an MHMM with three clusters as follows:

mhmm_init <- list(sc_init1, sc_init2, sc_init3)

mhmm_trans <- list(sc_trans1, sc_trans2, sc_trans3)

mhmm_emiss <- list(sc_emiss1,sc_emiss2, sc_emiss3)

mhmm <- build_mhmm(observations = seq, transition_probs = mhmm_trans, emission_probs = mhmm_emiss, initial_probs = mhmm_init, cluster_names = c("Cluster 1", "Cluster 2", "Cluster 3"))

My data, seq, is longitudinal data. Once the model was constructed, I estimated its parameters with the fit_model function as follows:

set.seed(1011) #1011

mhmm_fit <- fit_model(mhmm, local_step = TRUE, threads = 1,
                      control_em = list(restart = list(times =10)))

mhmm_final <- mhmm_fit$model

From mhmm_final, I can get several pieces of information about each of my three clusters, such as transition probabilities, initial probabilities and emission probabilities. For example, to get these estimates for Cluster 1, I can use the following code:

mhmm_final$transition_probs$`Cluster 1`

mhmm_final$emission_probs$`Cluster 1`

mhmm_final$initial_probs$`Cluster 1`

My question is how I can get the observations in each cluster. They are available as mhmm_final$observations, but this gives me all the observations from all three clusters together. I want to find the exact observations within each cluster, in this case Cluster 1.

Let's assume that I have 10 sequences (seq 1, seq 2, seq 3, seq 4, seq 5, seq 6, seq 7, seq 8, seq 9, seq 10), and I clustered them into three groups with this approach. I want to know which cluster each of these sequences belongs to.

Jouni Helske
Nazanin E
    Please add some minimal example to show how you have used the package. – Heikki Jan 30 '18 at 08:07
    As my response was too long, I could not add it as comment here. So I updated my initial post instead. Please check out the question again with the example. I hope it is helpful. Thanks! – Nazanin E Jan 30 '18 at 23:41

1 Answer

You can get the most probable cluster for each sequence from the model summary:

summary(mhmm_final)$most_probable_cluster
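To go from cluster labels to the sequences themselves, one option is to use the labels as a row index. The sketch below assumes that most_probable_cluster holds one label per sequence, in the same order as the rows of mhmm_final$observations, and that the model is single-channel. The objects cl and obs are made-up stand-ins so that the snippet runs without a fitted model; with real data they would be replaced as noted in the comments.

```r
# Stand-in for summary(mhmm_final)$most_probable_cluster:
# one cluster label per sequence (hypothetical values).
cl <- factor(
  c("Cluster 1", "Cluster 3", "Cluster 1", "Cluster 2", "Cluster 2",
    "Cluster 1", "Cluster 3", "Cluster 2", "Cluster 1", "Cluster 3"),
  levels = c("Cluster 1", "Cluster 2", "Cluster 3")
)

# Stand-in for mhmm_final$observations: 10 sequences of length 4
# (hypothetical states "a" and "b").
obs <- matrix(rep(c("a", "b"), 20), nrow = 10,
              dimnames = list(paste("seq", 1:10), paste0("t", 1:4)))

# Names of the sequences assigned to each cluster.
members <- split(rownames(obs), cl)
members[["Cluster 1"]]

# The observations assigned to Cluster 1 only.
obs_cluster1 <- obs[cl == "Cluster 1", , drop = FALSE]

# Number of sequences per cluster.
table(cl)
```

With a fitted model, the same indexing would read mhmm_final$observations[cl == "Cluster 1", ]. For a multichannel model the observations are stored as a list of channels, so the row index would have to be applied to each channel separately.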
Amir
  • @NazaninE please accept the answer if it solved your question. – Hadij May 31 '18 at 15:19
  • @Amir do you know how to get the probability of being in each cluster (instead of just the cluster number)? – Hadij May 31 '18 at 16:37
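On the last comment: the summary of a seqHMM mixture model should also contain posterior_cluster_probabilities, a matrix with one row per sequence and one column per cluster, from which most_probable_cluster is derived. The sketch below uses a made-up matrix post as a stand-in for summary(mhmm_final)$posterior_cluster_probabilities and shows how the hard assignment relates to it; the numbers are hypothetical.

```r
# Stand-in for summary(mhmm_final)$posterior_cluster_probabilities
# (hypothetical numbers): rows are sequences, columns are clusters.
post <- matrix(
  c(0.90, 0.05, 0.05,
    0.10, 0.20, 0.70,
    0.30, 0.60, 0.10),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste("seq", 1:3),
                  c("Cluster 1", "Cluster 2", "Cluster 3"))
)

# Each row is a probability distribution over the clusters.
rowSums(post)

# Hard assignment: the cluster with the highest posterior probability.
hard <- colnames(post)[apply(post, 1, which.max)]
hard

# Probability of the assigned cluster, as a rough measure of certainty.
apply(post, 1, max)
```

Sequences whose largest posterior probability is well below 1 are the ones where the cluster assignment is least certain.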