I'm trying to answer the question of how many observations have a certain sequence. for example my data looks like this:
CaseNo Code
1111 RF
1111 AV
1111 RF
2222 AV
2222 TF
3333 AV
and I need an output of:
CaseNo 1 2 3
1111 RF AV RF
2222 AV RF NULL
3333 AV NULL NULL
so I can match the sequences I am after.
I've sorted the data so the codes are in the correct order but I need to combine these codes into one line.
I've read that the spread function is a good method but this seems to only work on values but my code column is characters. I've also tried the group_by and nest functions using:
spread_df <- df %>% group_by(CaseNo) %>% nest()
and this brings it back as CaseNo, Code but the codes in Code column are in a list that I can't seem to split.
Is there a way to simply bring back the output I need? Or any other method I am overlooking in discovering sequences?
Thanks