I am trying to calculate growth rates between the specific rows after grouping by another variable (similar to other users).
Here is a sample of my data:
squirrel_id wt age trialdate
22639 9.7 0 2017-04-20
22639 45.9 24 2017-05-14
22639 130 53 2017-06-12 #caught 3x, 1 trial
22640 10.3 0 2017-04-20
22640 49.2 24 2017-05-14
22640 121 52 2017-06-11
22640 196 84 2017-07-13 #caught 4x, 2 trials
23943 12.9 1 2018-04-27
23943 57.2 26 2018-05-23 #caught 2x, 1 trial
23760 150 73 2018-06-18
23760 165 84 2018-06-29 #caught 2x, 2 trials
To get this data in this form, I made sure to arrange_by(squirrel_id)
first using the library(dplyr)
package.
What I am trying to do is calculate the growth rate between:
- last observation and second last observation divided by time that has passed (last_wt-second_last_wt/last_age-second_last_age): [using
squirrel_id
22640: (196-121)/(84-52)] and then add a column called “trial” that says “2” - second last observation and third last observation divided by time that has passed (second_last_wt-third_last_wt/second_last_age-third_last_age): [using
squirrel_id
22640: (121-49.2)/(52-24)] and then add a column called “trial” that says “1”
There is a catch though:
if a
squirrel_id
is seen 3 times or less in total (likesquirrel_id
22639 and 23943), then they would usually only have had 1 trial and therefore 1 growth rate calculation.BUT, if 2 of the observations are with an age > 40 days old (like
squirrel_id
23760), then they had 2 trials.
My hope is to have a final dataset that looks like:
squirrel_id wt age trialdate g.rate trial
22639 9.7 0 2017-04-20 NA NA
22639 45.9 24 2017-05-14 NA NA
22639 130 53 2017-06-12 3.0 1 #caught 3x, 1 trial
22640 10.3 0 2017-04-20 NA NA
22640 49.2 24 2017-05-14 NA NA
22640 121 52 2017-06-11 2.6 1
22640 196 84 2017-07-13 2.3 2 #caught 4x, 2 trials
23943 12.9 1 2018-04-27 NA NA
23943 57.2 26 2018-05-23 1.7 1 #caught 2x, 1 trial
23760 150 73 2018-06-18 NA 1
23760 165 84 2018-06-29 1.4 2 #caught 2x, 2 trials
I’d prefer a dplyr()
solution, if possible.