I've got a data set, called visitsPerDay, that looks like this, but with 405,890 rows and 10,406 unique CUST_ID values:
> CUST_ID Date
> 1 2013-09-19
> 1 2013-10-03
> 1 2013-10-08
> 1 2013-10-12
> 1 2013-10-20
> 1 2013-10-25
> 1 2013-11-01
> 1 2013-11-02
> 1 2013-11-08
> 1 2013-11-15
> 1 2013-11-23
> 1 2013-12-02
> 1 2013-12-04
> 1 2013-12-09
> 2 2013-09-16
> 2 2013-09-17
> 2 2013-09-18
What I'd like to do is create a new variable that holds, for each customer, the number of days between each visit and that customer's previous visit (a lagged difference of the dates). Here is the code I'm currently using:
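# Sort by customer, then by date, so each row follows that customer's previous visit
visitsPerDay <- visitsPerDay[order(visitsPerDay$CUST_ID, visitsPerDay$Date), ]
visitsPerDay$MTBV <- NA_real_
cust_id <- 0
for (i in seq_len(nrow(visitsPerDay))) {
  if (visitsPerDay$CUST_ID[i] != cust_id) {
    # First visit for this customer: no previous visit to compare against
    cust_id <- visitsPerDay$CUST_ID[i]
    visitsPerDay$MTBV[i] <- NA
  } else {
    # Days since this customer's previous visit
    visitsPerDay$MTBV[i] <- as.numeric(visitsPerDay$Date[i] - visitsPerDay$Date[i - 1])
  }
}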
I feel like this is certainly not the most efficient way to do this. Does anyone have a better way to approach it? Thanks!
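For reference, here is a minimal vectorized sketch of the result I'm after, using only base R (`order`, `ave`, `diff`). It assumes the Date column is already of class Date, and the six-row data frame is just a cut-down copy of the sample above for illustration. I haven't benchmarked it on the full data set.

```r
# Small sample in the same shape as the data above (first rows of customers 1 and 2).
visitsPerDay <- data.frame(
  CUST_ID = c(1, 1, 1, 2, 2, 2),
  Date = as.Date(c("2013-09-19", "2013-10-03", "2013-10-08",
                   "2013-09-16", "2013-09-17", "2013-09-18"))
)

# Sort by customer, then by date, so consecutive rows are consecutive visits.
visitsPerDay <- visitsPerDay[order(visitsPerDay$CUST_ID, visitsPerDay$Date), ]

# Within each customer, take the difference in days between successive visits.
# The first visit of each customer has no predecessor, so it gets NA.
visitsPerDay$MTBV <- ave(
  as.numeric(visitsPerDay$Date),   # dates as day counts
  visitsPerDay$CUST_ID,            # grouping variable
  FUN = function(d) c(NA, diff(d))
)

print(visitsPerDay)
```

Here `ave` applies the function to each customer's dates separately and returns the results in the original row order, so no explicit loop over rows is needed.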