I have a data frame containing number of page views per user, per week. I want to determine, for each user, whether their views increased, decreased, or stayed the same after a certain event. My data looks like this:
Userid week xeventinweek numviews
Alice 1 2 5
Alice 2 0 3
Alice 4 1 6
Bob 2 2 3
Bob 3 0 5
So in this case, Alice's views decreased after she had 2 events in week 1, and she had no events in week 2 to measure by. Bob, however, increased his views from 3 to 5 the week after he had two events.
I would like to get a table with the difference in views for every week that had at least one event. So it should look something like this:
Userid week xeventinweek numviews numnextweek difference
Alice 1 2 5 3 -2
Alice 4 1 6 NA NA #the row for week 2 is missing because there were no events then for Alice
Bob 2 2 3 5 2
It is not essential to have both the numnextweek and difference columns - either or is ok.
I was able to do this using data.table and a for loop, but it took so long to run that it wasn't feasible. I thought of using a rolling join, but it doesn't seem possible with grouped data (i.e. it would need to be done individually for each Userid.) How can I do this using data.table's native functionality?