
I am trying to implement an HMM in R. My dataset currently has 18 columns. When I build the emission matrix, I get an "undefined columns selected" error, and I don't know what I am doing wrong.

library(seqHMM)

emix <- matrix(NA, nrow = 3, ncol = 4)
emix[1,] <- seqstatf(exam_seq[, 1:5])[, 3] + 1
emix[2,] <- seqstatf(exam_seq[, 6:12])
emix[3,] <- seqstatf(exam_seq[, 13:18])

Error in `[.data.frame`(seqstatf(exam_seq[, 1:5]), , 3) : 
  undefined columns selected

Data posted by the OP in a comment.

structure(list(Freq = c(260, 262, 74, 1, 485, 106, 6, 
219, 215, 1282, 80), Percent = c(8.69565217391304, 
8.76254180602007, 2.47491638795987, 0.0334448160535117, 
16.2207357859532, 3.54515050167224, 0.20066889632107, 
7.32441471571906, 7.19063545150502, 42.876254180602, 
2.67558528428094 )), class = "data.frame", row.names = 
c("So", "Da", "UK", "Tw", "Al", "Ab", "D", "NA", "0", "1", 
"2"))
Rui Barradas
  • Can you post output of `dput(seqstatf(exam_seq[, 1:5]))`? – markus Sep 22 '18 at 10:10
  • @markus `structure(list(Freq = c(260, 262, 74, 1, 485, 106, 6, 219, 215, 1282, 80), Percent = c(8.69565217391304, 8.76254180602007, 2.47491638795987, 0.0334448160535117, 16.2207357859532, 3.54515050167224, 0.20066889632107, 7.32441471571906, 7.19063545150502, 42.876254180602, 2.67558528428094 )), class = "data.frame", row.names = c("So", "Da", "UK", "Tw", "Al", "Ab", "D", "NA", "0", "1", "2"))` – Aymen Tasneem Sep 22 '18 at 10:18
  • That's the output, @markus ^ – Aymen Tasneem Sep 22 '18 at 10:18
  • You create `emix` with 3 rows and 4 columns, the dataset you have posted has 11 rows and 2 columns. Also, where does function `seqstatf` come from? – Rui Barradas Sep 22 '18 at 10:45
  • @AymenTasneem The output you posted has two columns and you want to select the third one. That's why the error. – markus Sep 22 '18 at 11:32
  • @RuiBarradas How did you know that the dataset has 2 columns? I am following an HMM example that I found on a website; it uses the function **seqstatf** for the emission matrices. – Aymen Tasneem Sep 22 '18 at 11:48
  • @markus @RuiBarradas These are the first 3 rows of my dataset, and it has 18 columns: `Sequence 1 So-Al-1-1-1-2-1-0-1-0-0-0-0-0-0-0-0-0 2 Da-Al-1-1-1-2-0-1-0-0-0-0-0-0-0-0-0-1 3 Da-Al-1-1-1-2-0-1-0-0-0-0-0-0-0-0-0-0` – Aymen Tasneem Sep 22 '18 at 11:49
  • Your dataset has two columns, `Freq` and `Percent` and 11 rows, as can be seen by counting argument `row.names` or with base function `dim`. As for the second issue, function `seqstatf`, the question still holds, can you provide us with a link to it? – Rui Barradas Sep 22 '18 at 11:54
  • [link](https://rdrr.io/cran/seqHMM/src/R/build_hmm.R) Here is the example that I am following. I ran this example and it gave no error, but when I try to run it on my dataset, the emission matrix gives an error. @RuiBarradas – Aymen Tasneem Sep 22 '18 at 11:57
  • The link should be to CRAN package [seqHMM](https://cran.r-project.org/web/packages/seqHMM/index.html). – Rui Barradas Sep 22 '18 at 13:32
  • @RuiBarradas did you check the link? – Aymen Tasneem Sep 22 '18 at 18:30
  • Yes, I have. I believe that what you are doing wrong is running the same code unchanged on your dataset. You should try to understand how the example is applied to *that* data and then adapt it to your problem. – Rui Barradas Sep 22 '18 at 19:54
  • I shaped my data according to this example. I have 18 columns with 0 and 1 values. The code I mentioned only works for the first row of the `emix` matrix and does not read the rest of the columns of my dataset. The error is _"incorrect number of subscripts on matrix"_ or _"undefined columns selected"_. As I am new to R, it's difficult for me to interpret the error and correct my code accordingly. @RuiBarradas – Aymen Tasneem Sep 24 '18 at 11:15
  • @RuiBarradas Is there any other way to find emission probabilities for a given dataset? – Aymen Tasneem Sep 24 '18 at 11:28
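
The point made in the comments can be checked with base R alone, using the `dput` output posted above. The sketch below is only an illustration under stated assumptions, not the seqHMM example's method: it reproduces the error by asking for a third column of a two-column data frame, then fills one row of an emission matrix from the `Freq` column. The name `stat`, the pseudo-count of 1, and the choice of one matrix column per observed symbol (11 here, rather than 4) are assumptions made for the example.

```r
# Data exactly as posted by the OP: the output of seqstatf(exam_seq[, 1:5])
stat <- structure(list(Freq = c(260, 262, 74, 1, 485, 106, 6,
219, 215, 1282, 80), Percent = c(8.69565217391304,
8.76254180602007, 2.47491638795987, 0.0334448160535117,
16.2207357859532, 3.54515050167224, 0.20066889632107,
7.32441471571906, 7.19063545150502, 42.876254180602,
2.67558528428094)), class = "data.frame", row.names =
c("So", "Da", "UK", "Tw", "Al", "Ab", "D", "NA", "0", "1",
"2"))

dim(stat)    # number of rows, then number of columns
names(stat)  # the only columns available are Freq and Percent

# Asking for a third column fails; capture the message instead of stopping
msg <- tryCatch(stat[, 3], error = function(e) conditionMessage(e))
msg

# Assumption: take the Freq column and add a pseudo-count of 1,
# so that no symbol ends up with probability exactly zero
counts <- stat[, "Freq"] + 1

# Assumption: one matrix column per observed symbol, so that the
# row assignment below has a matching length
emix <- matrix(NA_real_, nrow = 3, ncol = nrow(stat),
               dimnames = list(NULL, rownames(stat)))

# Normalise so that the row is a probability distribution
emix[1, ] <- counts / sum(counts)
sum(emix[1, ])
```

Rows 2 and 3 would be filled the same way from the other column ranges, provided each call returns one value per symbol in the same order. Selecting the column by name (`"Freq"` or `"Percent"`) rather than by position avoids the "undefined columns selected" error.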

0 Answers