Consider a data frame ordered by a column with dates:

df <- data.frame(event = 1:12,
                 subject = rep("M325", 12),
                 date = c(rep("2017-11-01", 4), rep("2017-11-14", 8)))

I want to add a fourth column that numbers the unique dates in order, starting from 1, so that every row sharing the i-th unique date gets the value i. For example:

   event subject       date num
1      1    M325 2017-11-01   1
2      2    M325 2017-11-01   1
3      3    M325 2017-11-01   1
4      4    M325 2017-11-01   1
5      5    M325 2017-11-14   2
6      6    M325 2017-11-14   2
7      7    M325 2017-11-14   2
8      8    M325 2017-11-14   2
9      9    M325 2017-11-14   2
10    10    M325 2017-11-14   2
11    11    M325 2017-11-14   2
12    12    M325 2017-11-14   2

Any advice on how to get this result for n dates would be much appreciated.

Ronak Shah
jealcalat
  • Try `df$num <- cumsum(!duplicated(df$date))` if the data is ordered by date. If the numbering should instead follow blocks of 'date', so that a date which occurs again later gets a new `num`, use `setDT(df)[, num := rleid(date)]` – akrun Dec 19 '17 at 06:02
  • @akrun Wow, that was fast! The first worked perfectly! Thanks – jealcalat Dec 19 '17 at 06:11

1 Answer

The answer by @akrun in the comments works when the data is ordered by date:

df$num <- cumsum(!duplicated(df$date))

Or, using data.table:

library(data.table)
setDT(df)[, num := rleid(date)]

Although both of these are faster, an approach using rle can also solve my problem.
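The rle code itself is not shown above, so the following is only a minimal sketch of how such an approach could look in base R, not necessarily the code originally referred to. The variable name `runs` is my own choice, and `as.character()` is there because `rle()` does not accept a factor, which is what `data.frame()` creates from strings in R versions before 4.0.

```r
# Example data from the question
df <- data.frame(event = 1:12,
                 subject = rep("M325", 12),
                 date = c(rep("2017-11-01", 4), rep("2017-11-14", 8)))

# rle() finds runs of consecutive identical values;
# as.character() guards against 'date' being stored as a factor
runs <- rle(as.character(df$date))

# Number the runs 1, 2, ... and repeat each number over the length of its run
df$num <- rep(seq_along(runs$lengths), times = runs$lengths)
```

Like `rleid()`, this numbers consecutive blocks, so a date that reappears after a different date starts a new number. With data sorted by date, as in the question, it gives the same result as `cumsum(!duplicated(df$date))`.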

jealcalat