Here is a simplified version of what my data set looks like:
> df
   ID total_sleep sleep_end_date
1   1           9     2017-09-03
2   1           8     2017-09-04
3   1           7     2017-09-05
4   1          10     2017-09-06
5   1          11     2017-09-07
6   2           5     2017-09-03
7   2          12     2017-09-04
8   2           4     2017-09-05
9   2           3     2017-09-06
10  2           6     2017-09-08
Here, total_sleep is expressed in hours.
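To make the example reproducible, the data frame can be built as in the sketch below. It assumes sleep_end_date is stored as a Date (not a character string) and that ID 2 has no record for 2017-09-07, so its last row is dated 2017-09-08.

```r
# Reproducible version of the example data.
# Assumption: sleep_end_date is of class Date, and ID 2 has no
# record for 2017-09-07 (its last row is dated 2017-09-08).
df <- data.frame(
  ID = rep(1:2, each = 5),
  total_sleep = c(9, 8, 7, 10, 11, 5, 12, 4, 3, 6),
  sleep_end_date = as.Date(c(
    "2017-09-03", "2017-09-04", "2017-09-05", "2017-09-06", "2017-09-07",
    "2017-09-03", "2017-09-04", "2017-09-05", "2017-09-06", "2017-09-08"
  ))
)
```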
What I am trying to find is the absolute difference in hours of sleep between every two consecutive dates, within each user ID. The desired output should look something like this:
> df_answer
   ID total_sleep sleep_end_date diff_hours_of_sleep
1   1           9     2017-09-03                  NA
2   1           8     2017-09-04                   1
3   1           7     2017-09-05                   1
4   1          10     2017-09-06                   3
5   1          11     2017-09-07                   1
6   2           5     2017-09-03                  NA
7   2          12     2017-09-04                   7
8   2           4     2017-09-05                   8
9   2           3     2017-09-06                   1
10  2           6     2017-09-08                  NA
NA appears in rows 1 and 6 because each is the first record for its user, so there is no data for the day before.
Most importantly, NA also appears in row 10 because I don't have any data for the previous day (2017-09-07) for that user. This has been the trickiest part for me to code.
I've googled (meaning: "stackoverflowed") this and tried to find a solution using the "data wrangling cheatsheet" for dplyr, but I haven't been able to find a function that does what I want while taking both variables into account: the date and the user ID.
I am a beginner in R, so I might indeed be missing something simple. Any input or suggestion would be very welcome!
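For reference, here is a minimal sketch of one possible approach with dplyr, not necessarily the best one. It assumes sleep_end_date is a Date, that there is at most one row per user per day, and that "consecutive" means exactly one calendar day apart. The idea is to group by ID, compare each row with the previous one using lag(), and keep the difference only when the gap between the two dates is one day.

```r
library(dplyr)

# Example data (assumption: ID 2 has no record for 2017-09-07).
df <- data.frame(
  ID = rep(1:2, each = 5),
  total_sleep = c(9, 8, 7, 10, 11, 5, 12, 4, 3, 6),
  sleep_end_date = as.Date(c(
    "2017-09-03", "2017-09-04", "2017-09-05", "2017-09-06", "2017-09-07",
    "2017-09-03", "2017-09-04", "2017-09-05", "2017-09-06", "2017-09-08"
  ))
)

df_answer <- df %>%
  group_by(ID) %>%                                   # work within each user
  arrange(sleep_end_date, .by_group = TRUE) %>%      # order rows by date per user
  mutate(
    # Gap in days to the previous row of the same user (NA for the first row).
    gap_days = as.numeric(
      difftime(sleep_end_date, lag(sleep_end_date), units = "days")
    ),
    # Keep the absolute difference only when the previous row is the day before;
    # an NA gap (first row of a user) also yields NA.
    diff_hours_of_sleep = if_else(
      gap_days == 1,
      abs(total_sleep - lag(total_sleep)),
      NA_real_
    )
  ) %>%
  ungroup() %>%
  select(-gap_days)                                  # drop the helper column

print(df_answer)
```

The helper column gap_days is a name introduced here for illustration only. Because lag() is applied after group_by(), the first row of each user has no previous row and ends up as NA, which covers rows 1 and 6; the date-gap check covers row 10.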