Creating a new dataset of hidden state probabilities using an HMM results in different shapes after each run

Question

I'm trying to create a new dataset of hidden state probabilities using a hidden Markov model. Everything works fine, except that on each run hidden_states_train and hidden_states_test come out with different shapes (sometimes the same), which results in different column counts after the column stack, i.e. a feature mismatch. For example: New dataset size (15261, 197) (5087, 194), New dataset size (15261, 197) (5087, 197), etc.

I can't figure out why this happens each time I run the code. I tried giving the same number of samples for both X_train_st and X_test_st, but it keeps happening. If I use a smaller range for n_comp, e.g. for n_comp in range(1,6), the shapes often come out the same.

Can someone shed some light on what's going on and suggest a possible fix, please?

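import numpy as np
from hmmlearn.hmm import GaussianHMM
# Assumed source of to_categorical (not stated in the question):
from tensorflow.keras.utils import to_categorical

# X_train_st and X_test_st are the scaled feature arrays defined earlier.
newX = X_train_st
newXtest = X_test_st

for n_comp in range(1, 16):
    print("fitting to HMM and decoding %d ..." % n_comp, end="")
    modelHMM = GaussianHMM(n_components=n_comp, covariance_type="diag").fit(X_train_st)

    # One-hot encode the decoded hidden state of every sample
    hidden_states_train = to_categorical(modelHMM.predict(X_train_st))
    hidden_states_test = to_categorical(modelHMM.predict(X_test_st))

    print("done")
    # Append the encoded states as extra feature columns
    newX = np.column_stack((newX, hidden_states_train))
    newXtest = np.column_stack((newXtest, hidden_states_test))

print('New dataset size', newX.shape, newXtest.shape)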

Can anyone help, please? Hi marc_s, thanks. Why isn't anyone viewing this question? It still has no answers. - Sarah M, Dec 26 '22 at 07:57
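
One plausible explanation, assuming to_categorical here is the Keras utility: when num_classes is not given, it infers the number of columns from the largest label in its input. If the decoded test sequence never visits the highest-numbered state, the test encoding comes out narrower than the train encoding. The sketch below reproduces that effect with NumPy only. The one_hot helper and the state sequences are made up for illustration and are not part of the original code.

```python
import numpy as np


def one_hot(labels, num_classes=None):
    """Hypothetical stand-in for a to_categorical-style encoder."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        # Width inferred from the data: largest label seen + 1
        num_classes = labels.max() + 1
    return np.eye(num_classes)[labels]


n_comp = 4
# Made-up decoded state sequences: the test sequence never visits state 3
train_states = np.array([0, 1, 2, 3, 1, 0])
test_states = np.array([0, 1, 2, 1, 0])

# Inferred width: train and test end up with different column counts
inferred_train = one_hot(train_states)
inferred_test = one_hot(test_states)
print(inferred_train.shape, inferred_test.shape)  # (6, 4) (5, 3)

# Fixed width: both always have n_comp columns
fixed_train = one_hot(train_states, num_classes=n_comp)
fixed_test = one_hot(test_states, num_classes=n_comp)
print(fixed_train.shape, fixed_test.shape)  # (6, 4) (5, 4)
```

If that is the cause, passing num_classes=n_comp to both to_categorical calls in the loop should keep the train and test widths equal. The run-to-run variation would then come from GaussianHMM being initialized randomly unless random_state is set, so which states the test data reaches changes between runs. Note also that predict returns hard state labels. If per-state probabilities are what is wanted, hmmlearn's predict_proba returns one column per component regardless of which states are visited.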
