0

I have a data.table in R that appears like

    V=data.table(person=c(1,1,2,3),task=c("A","A","B","C"),scores=c("Low|Low|High|Medium","Low","High|Low",""))
    > V
      person task              scores
    1:     1    A Low|Low|High|Medium
    2:     1    A                 Low
    3:     2    B            High|Low
    4:     3    C           

I would like to extract all the elements between the |, expanding length-wise, and then adding a counter by person and task. By doing

> V[, list(score = unlist(strsplit(scores, "\\|"))), by = .(rater, task)]
   rater task  score
1:     1    A    Low
2:     1    A    Low
3:     1    A   High
4:     1    A Medium
5:     1    A    Low
6:     2    B   High
7:     2    B    Low

I can get the expansion, but I am unable to add a counter column which would show

       rater task  score  counter
    1:     1    A    Low     0
    2:     1    A    Low     1
    3:     1    A   High     2
    4:     1    A Medium     3
    5:     1    A    Low     4
    6:     2    B   High     0
    7:     2    B    Low     1

Each attempt I have is futile when there are missing scores.

user321627
  • 2,350
  • 4
  • 20
  • 43

1 Answers1

1

We can sequence along .N within each group using rowid:

V[, list(score = unlist(strsplit(scores, "\\|"))), by = .(person, task)
  ][, counter := rowid(person, task) - 1][]
#    person   task  score counter
#     <num> <char> <char>   <num>
# 1:      1      A    Low       0
# 2:      1      A    Low       1
# 3:      1      A   High       2
# 4:      1      A Medium       3
# 5:      1      A    Low       4
# 6:      2      B   High       0
# 7:      2      B    Low       1

The use of rowid(.) - 1 is akin to seq_len(.N) - 1 but more data.table-canonical (thanks @zx8754 for the suggestion!).

r2evans
  • 141,215
  • 6
  • 77
  • 149