I'm looking for an R package that groups consecutive dates into periods. The grouping must be done separately for each combination of FID, PID and SETTING:
# Input data
input <- read.csv(text=
"FID,PID,SETTING,DATE
00001, 100001, ST, 2021-01-01
00001, 100001, ST, 2021-01-02
00001, 100001, ST, 2021-01-03
00001, 100002, AB, 2021-01-04
00001, 100001, ST, 2021-01-11
00001, 100001, ST, 2021-01-12
00002, 200001, AB, 2021-01-02
00002, 200001, AB, 2021-01-03
00002, 200001, AB, 2021-01-04
00002, 200002, TK, 2021-01-05"
)
# Expected output
output <- read.csv(text="
FID,PID,SETTING,START,END
00001, 100001, ST, 2021-01-01, 2021-01-03
00001, 100002, AB, 2021-01-04, 2021-01-04
00001, 100001, ST, 2021-01-11, 2021-01-12
00002, 200001, AB, 2021-01-02, 2021-01-04
00002, 200002, TK, 2021-01-05, 2021-01-05"
)
I have to group around 700,000 rows, so the solution should be as fast as possible.
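To make the requirement concrete, here is a minimal base-R sketch of the grouping described above. It assumes that "consecutive" means dates at most one day apart within the same FID/PID/SETTING combination, and that duplicate dates belong to the same period. The function name `collapse_periods` is hypothetical and does not come from any package. The sample data is read with `strip.white = TRUE` and `colClasses = "character"` so that the leading zeros in FID are kept and the padding spaces are dropped.

```r
# Sample data from the question; keep everything as character so that
# leading zeros in FID survive, and strip the padding spaces.
input <- read.csv(text =
"FID,PID,SETTING,DATE
00001, 100001, ST, 2021-01-01
00001, 100001, ST, 2021-01-02
00001, 100001, ST, 2021-01-03
00001, 100002, AB, 2021-01-04
00001, 100001, ST, 2021-01-11
00001, 100001, ST, 2021-01-12
00002, 200001, AB, 2021-01-02
00002, 200001, AB, 2021-01-03
00002, 200001, AB, 2021-01-04
00002, 200002, TK, 2021-01-05",
  strip.white = TRUE, colClasses = "character")

# Hypothetical helper (not from a package): collapse runs of consecutive
# days within each FID/PID/SETTING combination into START/END periods.
collapse_periods <- function(df) {
  stopifnot(nrow(df) > 0L)
  df$DATE <- as.Date(trimws(as.character(df$DATE)))

  # Sort so that the rows of each key are adjacent and dates increase
  df <- df[order(df$FID, df$PID, df$SETTING, df$DATE), ]
  n <- nrow(df)

  # Compare every row with the previous one
  same_key <- df$FID[-1] == df$FID[-n] &
    df$PID[-1] == df$PID[-n] &
    df$SETTING[-1] == df$SETTING[-n]
  gap <- diff(as.integer(df$DATE))  # gap to the previous row, in days

  # A new period starts on the first row, when the key changes,
  # or when more than one day has passed since the previous date
  new_period <- c(TRUE, !same_key | gap > 1L)
  starts <- which(new_period)
  ends <- c(starts[-1] - 1L, n)

  res <- data.frame(
    FID = df$FID[starts],
    PID = df$PID[starts],
    SETTING = df$SETTING[starts],
    START = df$DATE[starts],
    END = df$DATE[ends],
    stringsAsFactors = FALSE
  )

  # Order by FID and START, as in the expected output
  res <- res[order(res$FID, res$START), ]
  rownames(res) <- NULL
  res
}

periods <- collapse_periods(input)
print(periods)
```

On the sample data this yields the five periods listed in the expected output. The sketch uses only vectorized operations (one sort, one `diff`, a few comparisons), so it may well be acceptable for roughly 700,000 rows, but it has not been benchmarked here; a `data.table` or `dplyr` version of the same idea could be faster.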