
I am trying to generate this table as one of the inputs to a test.

        id                 diff          d
 1:      1                    2 2020-07-31
 2:      1                    1 2020-08-01
 3:      1                    1 2020-08-02
 4:      1                    1 2020-08-03
 5:      1                    1 2020-08-04
 6:      2                    2 2020-07-31
 7:      2                    1 2020-08-01
 8:      2                    1 2020-08-02
 9:      2                    1 2020-08-03
10:      2                    1 2020-08-04
11:      3                    2 2020-07-31
12:      3                    1 2020-08-01
13:      3                    1 2020-08-02
14:      3                    1 2020-08-03
15:      3                    1 2020-08-04
16:      4                    2 2020-07-31
17:      4                    1 2020-08-01
18:      4                    1 2020-08-02
19:      4                    1 2020-08-03
20:      4                    1 2020-08-04
21:      5                    2 2020-07-31
22:      5                    1 2020-08-01
23:      5                    1 2020-08-02
24:      5                    1 2020-08-03
25:      5                    1 2020-08-04
        id                 diff          d

I have done it like this:

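library(data.table)
# one row per id, with a default diff of 1
input1 = data.table(id=as.character(1:5), diff=1)
# expand each (id, diff) group into five consecutive dates
input1 = input1[,.(d=seq(as.Date('2020-07-31'), by='days', length.out = 5)),.(id, diff)]
input1[d == '2020-07-31']$diff = 2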

diff is the number of days to the next weekday. For example, 31 July 2020 is a Friday, so diff is 2, which is the diff to the next weekday, Monday. For the other dates it is 1.

  • Is there a more idiomatic R way of doing this?

I don't like that I had to generate the date sequence separately for each id, or that I had to hardcode the diff for 31 July in the input. Is there a more generic way of doing this without the hardcoding?

markus
leoOrion

1 Answer


We can create all combinations of id and date using crossing, then create the diff column based on whether the weekday is "Friday".

library(dplyr)

tidyr::crossing(id = 1:5, d = seq(as.Date('2020-07-31'), 
                          by='days', length.out = 5)) %>%
    mutate(diff = as.integer(weekdays(d) == 'Friday') + 1)

Similar logic using base R's expand.grid:

transform(expand.grid(id = 1:5, 
                      d = seq(as.Date('2020-07-31'), by='days', length.out = 5)), 
          diff = as.integer(weekdays(d) == 'Friday') + 1)

And with CJ in data.table:

library(data.table)
df <- CJ(id = 1:5, d = seq(as.Date('2020-07-31'), by='days', length.out = 5))
df[, diff := as.integer(weekdays(d) == 'Friday') + 1]
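
One caveat: weekdays() returns day names in the current locale, so the comparison with 'Friday' only holds in an English locale. A minimal sketch of a locale-independent variant, assuming data.table's wday(), which numbers days from 1 (Sunday) to 7 (Saturday), so Friday is 6. The setcolorder() call is an addition to match the column order of the table in the question:

```r
library(data.table)

# All id/date combinations, sorted by id and then d
df <- CJ(id = 1:5, d = seq(as.Date('2020-07-31'), by = 'days', length.out = 5))

# wday() returns 1 (Sunday) to 7 (Saturday); Friday is 6
df[, diff := fifelse(wday(d) == 6L, 2L, 1L)]

# Reorder columns in place to id, diff, d
setcolorder(df, c('id', 'diff', 'd'))
```

If the test expects character ids, as in the question, id = as.character(1:5) can be passed to CJ instead.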
Ronak Shah
  • Thanks. The CJ worked. Out of curiosity I ran class(df) on the above output. It shows `[1] "data.table" "data.frame"`. Why does it show both? – leoOrion Jul 31 '20 at 09:23
  • An object can have more than one class, and I think for `data.table` it always shows both of them. – Ronak Shah Jul 31 '20 at 09:33
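
The two-element class vector is how S3 expresses inheritance: a data.table is also a data.frame, so functions written for data frames accept it. A small sketch of checking this, using a tiny hypothetical table built with CJ:

```r
library(data.table)

# A small data.table, just for inspecting its class
df <- CJ(id = 1:2, d = as.Date('2020-07-31'))

class(df)                    # "data.table" "data.frame"
inherits(df, 'data.frame')   # TRUE
is.data.frame(df)            # TRUE
```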