This question arose from another one I asked (Find specific patterns in sequences), but I believe it is a separate issue.

Following the response by Gilbert, I tried to create an event sequence from a state sequence, but I ran into a problem.

The suggestion was to use

seqe=seqecreate(comp.seq,tevent="state")

to then use

seqefsub(seqe,strsubseq="(a)-(d)")

But when I try to use seqecreate() I get the following error:

Error in `seqelength<-`(`*tmp*`, value = c(64, 64, 64, 64, 61, 62, 61,  : (...) 
s and len should be of the same size.

The same happens if I try to convert it to an event sequence using:

seqe=seqecreate(comp.seq,tevent="transition")

By testing subsets of rows, I found that the problematic rows are all in a constant state: they have no transitions and remain in the same state permanently (e.g. A-A-A-A-A-A).

So my question is:

  1. Is there any flag or option I can set to make the conversion work?
  2. If not, how can I delete those rows, given that they have different lengths and missing values? For instance, I may have sequences like:

    missing-missing-A-A-A-A
    A-A-missing-missing-missing-missing-missing

Thanks a lot in advance!

Providing a sample of my data:

comp.seq <- seqdef(comp,NULL,states=comp.scodes,labels=comp.labels, alphabet=comp.alphabet,missing="Z") comp.seq[1:7,] 1 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-B-B-B-B-B-B-B-B-D-D-D-D-D-A-A-A-A-A-A-A-A-A 2 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-C-C-C-C-C-C-C-C-C-C-C-C-C-C-*-B-B-B-B-B-B-B-B-B-B-B-B-B-A-A-A-A-A-A 3 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-A-A-A-C-C-A-A-A-A-A-A-A-D-D-A-A-A-A-A-A-A-A 4 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-B-B-B-B-B-B-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A-A 5 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-C-C-D-D-D-D-D-D-D-D-D-D-A-A-A-A-A 6 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-B-B-B-B-B-B-B-B-B-B-B-B-B-D-D-D-D-D-D-D-D-A-A-A-A 7 *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-A-A-A-A-A-A-A-A-A-A-A-A

Row #7 is a problematic one. If I use seqecreate(comp.seq[1:6,]) instead, it works.
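For flagging rows that stay in a single state, a minimal base R sketch could look like the following. It works on sequence strings in the printed format shown above; `is_constant` is a hypothetical helper (not a TraMineR function), and the toy strings are made up for illustration.

```r
# Hypothetical helper, not part of TraMineR: returns TRUE for sequences
# that visit at most one distinct non-missing state.
is_constant <- function(seq_strings, sep = "-", miss = "*") {
  vapply(strsplit(seq_strings, sep, fixed = TRUE), function(states) {
    # Ignore the missing-value symbol, then count the distinct states left
    length(unique(states[states != miss])) <= 1
  }, logical(1))
}

# Made-up sequences with different lengths and missing values
toy <- c("*-*-B-B-D-A", "*-*-*-A-A-A", "A-A-*-*-*")
constant <- is_constant(toy)
print(constant)
```

The logical vector could then be used to subset the data before building the sequence object. Note that a sequence such as A-*-A is also flagged as constant here, which may or may not be what is wanted.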

Pedro Braz
  • We cannot see from your example what StatesSequence is. Please, provide a minimal example showing the original data and the creation of the state sequence object with `seqdef`. – Gilbert Jan 23 '15 at 19:39

3 Answers

The error occurs when there are missing states and the sequences are of different lengths. A workaround is to set right="NA" in the seqdef call.

Here is a minimal example:

x1 <- "*-*-A-B"
x2 <- "*-A-A"
dat.str <- data.frame(string=rbind(x1,x2))
dat <- seqdecomp(dat.str, sep="-", miss="*")

## creating state sequence object with and without right="NA"
dat.seq.NA <- seqdef(dat, right="NA")
dat.seq.void <- seqdef(dat)

## next command works without error
dat.eseq <- seqecreate(dat.seq.NA, tevent="state")

## while this one produces the error
dat.eseq <- seqecreate(dat.seq.void, tevent="state")
Gilbert

In the sequence I used, I had set a code for missing values via the missing="Z" option of seqdef().

I managed to make it work by not setting the missing option and instead creating a "dummy" state Z, which I added to the alphabet with the label "Z-missing". I also set the options left="Z" and right="Z".

It still looks like a bug to me, though.
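The workaround described above might look roughly like this. This is only a sketch on made-up toy data, not my actual call: the data frame, the column names and the `toy.*` object names are assumptions for illustration, and it assumes the TraMineR package is installed.

```r
library(TraMineR)

# Made-up data: "Z" marks a missing observation, and the second
# sequence is one position shorter (NA in the last column).
toy <- data.frame(
  t1 = c("Z", "Z"),
  t2 = c("Z", "A"),
  t3 = c("A", "A"),
  t4 = c("B", NA),
  stringsAsFactors = FALSE
)

# No `missing=` argument: "Z" is declared as an ordinary state of the
# alphabet, and leading/trailing NAs are recoded to "Z" via left/right.
toy.seq <- seqdef(toy,
                  alphabet = c("A", "B", "Z"),
                  labels = c("A", "B", "Z-missing"),
                  left = "Z", right = "Z")

# With no real missing values left, the conversion should go through
toy.eseq <- seqecreate(toy.seq, tevent = "state")
```

The drawback is that Z now counts as a real state, so it shows up in the event sequences and in any subsequence search.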

Pedro Braz

I'm not sure whether this is the answer, but the TraMineR NEWS page on CRAN, http://cran.r-project.org/web/packages/TraMineR/NEWS, which covers the development version 1.9.8 of TraMineR, mentions a bug:

Bug fixes: - seqformat(): When converting from STS to TSE, an error was raised if the tevent matrix had empty strings (i.e. ""). Now, this is considered as no event.

It is not exactly the same case, since the sequence is not empty per se, but it might be the same issue. I believe seqformat() is used internally, so the bug might be related.

I'll download the development version and post here how it goes.

Pedro Braz