I'm using R and have a data.table DT with 30 million rows:
userid, date, measurement
101, 1/1/2011, 13
101, 2/1/2011, 42
...
333, 1/1/2011, 67
...
I'm thinking of aggregating the observations by userid and week.
My current idea is to convert date into an integer, divide it by 7, and take the floor to create a new variable week. Finally, I can aggregate with:
DT[, .(measurement.Sum = sum(measurement)), by = .(userid, week)]
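Spelled out, the steps I have in mind look roughly like the sketch below. It is only a sketch under some assumptions: the three-row table is toy data standing in for the real DT, the date column is assumed to hold day/month/year strings, and the helper column day is a name I made up for illustration.

```r
library(data.table)

# Toy data in the same shape as DT (the real table has about 30 million rows)
DT <- data.table(
  userid      = c(101L, 101L, 333L),
  date        = c("1/1/2011", "2/1/2011", "1/1/2011"),
  measurement = c(13, 42, 67)
)

# Assumption: dates are day/month/year strings; use "%m/%d/%Y" if they are not.
# as.IDate stores the date as an integer count of days since 1970-01-01.
DT[, day := as.integer(as.IDate(date, format = "%d/%m/%Y"))]

# Integer division gives 7-day bins counted from 1970-01-01.
# That day was a Thursday, so these bins start on Thursdays, not calendar weeks.
DT[, week := day %/% 7L]

# Sum the measurements within each (userid, week) group
res <- DT[, .(measurement.Sum = sum(measurement)), by = .(userid, week)]
print(res)
```

Using %/% on an integer column does the divide-and-floor in one step. If the weeks need to start on a particular weekday, a constant offset could be added to day before dividing.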
Would this be the fastest way of doing things? (I read about the zoo library, but it seems troublesome to switch between data.table and zoo.)