I'm trying to find the cases that contain a specific subsequence or substring. Sections 10.5.1-10.5.2 of the user's guide show how to search for specific subsequences:

mysubseqstr <- character(2)
mysubseqstr[1] <- "(Parent)-(Left)-(Left+Marr)"
mysubseqstr[2] <- "(Parent)-(Left+Marr)"
mysubseq <- seqefsub(bf.seqestate, strsubseq = mysubseqstr)
print(mysubseq)

                  Subsequence Support Count
1        (Parent)-(Left+Marr)  0.4870   974
2 (Parent)-(Left)-(Left+Marr)  0.2275   455

Computed on 2000 event sequences
   Constraint Value
  countMethod One by sequence

msubcount <- seqeapplysub(mysubseq, method = "count")
msubcount[1:3, ]

Following another question answered here (Find specific patterns in sequences), I can list the sequences that contain the subsequence:

rownames(msubcount)[msubcount[,1]==1]

but I can't figure out how to get the list of ids (defined with the id= option of the seqdef function) of the sequences that contain this subsequence.

Gilbert
user3315563

1 Answer

The ids passed with the id= argument of seqdef are used as row names of the state sequence object. Assuming your state sequence object is seq and that you created the event sequence object bf.seqestate from it (which you do not show in your code!), you get the ids of the sequences containing the first of your subsequences (column 1 of msubcount) with:

rownames(seq)[msubcount[,1] > 0]

Note that I use > 0 here rather than == 1 because, with method = "count", seqeapplysub returns the number of occurrences of the subsequence in each sequence, which can be greater than 1.
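
To make the difference between the two tests concrete, here is a minimal base-R sketch. It does not call TraMineR: the ids vector and the count matrix are made up for illustration and only stand in for rownames(seq) and the real msubcount.

```r
# Hypothetical ids, standing in for rownames(seq)
ids <- c("id101", "id102", "id103", "id104")

# Made-up count matrix, standing in for the result of
# seqeapplysub(mysubseq, method = "count"):
# one row per sequence, one column per subsequence
msubcount <- matrix(
  c(1, 0, 2, 0,   # occurrences of the first subsequence
    0, 0, 1, 1),  # occurrences of the second subsequence
  ncol = 2,
  dimnames = list(NULL, c("(Parent)-(Left+Marr)",
                          "(Parent)-(Left)-(Left+Marr)"))
)

# Ids of the sequences containing the first subsequence at least once
ids_gt <- ids[msubcount[, 1] > 0]

# Testing == 1 misses the third sequence, which contains it twice
ids_eq <- ids[msubcount[, 1] == 1]

print(ids_gt)
print(ids_eq)
```

In this toy example the > 0 test selects id101 and id103, while the == 1 test selects only id101. With your own data, replace ids with rownames(seq), provided the rows of msubcount are in the same order as the sequences in seq.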

Gilbert