This question arose from an earlier one I asked (Find specific patterns in sequences), but I believe it is a separate issue.
Following Gilbert's response, I tried to create an event sequence object from a state sequence object, but I ran into a problem.
The suggestion was to use
seqe=seqecreate(comp.seq,tevent="state")
and then
seqefsub(seqe,strsubseq="(a)-(d)")
But when I call seqecreate(), I get the following error:
Error in `seqelength<-`(`*tmp*`, value = c(64, 64, 64, 64, 61, 62, 61, : (...)
s and len should be of the same size.
The same happens if I try to convert it to an event sequence object using:
seqe=seqecreate(comp.seq,tevent="transition")
By running the conversion on subsets of rows, I found that the problematic rows are all in a constant state: they have no transitions at all and remain in the same state permanently (e.g. A-A-A-A-A-A).
So my question is:
- Is there any flag or option I can set so that the conversion works?
- If not, how can I delete those rows, given that they have different lengths and missing values? For instance, I may have sequences like:
missing-missing-A-A-A-A
A-A-missing-missing-missing-missing-missing
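To make the second question concrete, here is a minimal sketch of the kind of filter I have in mind, written in base R on toy data rather than on my real `comp` object. The matrix `toy`, the helper `is_constant()` and its `missing_code` argument are hypothetical names used only for illustration; the sketch assumes the data is a character matrix with one row per sequence and "Z" as the missing code, and that dropping the rows before calling seqdef() is acceptable.

```r
# Toy data in the same spirit as `comp`: one row per sequence, "Z" marks missing.
# (Hypothetical values, not the real data.)
toy <- rbind(
  c("Z", "Z", "B", "B", "D", "A"),
  c("Z", "C", "Z", "B", "A", "A"),
  c("Z", "Z", "Z", "A", "A", "A"),  # constant: only ever in state A
  c("A", "A", "Z", "Z", "Z", "Z")   # constant, with trailing missing values
)

# TRUE when a row visits at most one distinct non-missing state,
# i.e. it contains no transition regardless of its length.
is_constant <- function(states, missing_code = "Z") {
  observed <- states[!is.na(states) & states != missing_code]
  length(unique(observed)) <= 1
}

constant_rows <- apply(toy, 1, is_constant)
which(constant_rows)                              # indices of the rows to drop
toy_kept <- toy[!constant_rows, , drop = FALSE]   # rows with at least one transition
```

The remaining rows (`toy_kept` here) would then be passed to seqdef() and seqecreate() as before. Whether this is the right approach, or whether seqecreate() can handle such rows directly, is exactly what I am unsure about.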
Thanks a lot in advance!
Here is a sample of my data:
comp.seq <- seqdef(comp,NULL,states=comp.scodes,labels=comp.labels, alphabet=comp.alphabet,missing="Z")
comp.seq[1:7,]
1 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-B-B-B-B-B-B-B-B-D-D-D-D-D-A-A-A-A-A-A-A-A-A
2 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-C-C-C-C-C-C-C-C-C-C-C-C-C-C-*-B-B-B-B-B-B-B-B-B-B-B-B-B-A-A-A-A-A-A
3 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-A-A-A-C-C-A-A-A-A-A-A-A-D-D-A-A-A-A-A-A-A-A
4 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-B-B-B-B-B-B-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A
5 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-C-C-D-D-D-D-D-D-D-D-D-D-A-A-A-A-A
6 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-B-B-B-B-B-B-B-B-B-B-B-B-B-D-D-D-D-D-D-D-D-A-A-A-A
7 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-A-A-A-A-A-A-A-A-A-A-A-A
Row #7 is one of the problematic rows. If I leave it out and call seqecreate(comp.seq[1:6,]), it works.