I am working with productivity data in a bird species. I would like to include pair experience, defined as the total number of clutches laid to date, as a variable to investigate if it has an effect on productivity.
To do this I need to count the total number of clutches laid before the current clutch for each pair.
Here is the data I am working with:
Pair.ID laydate
1 GGM 022 <NA>
2 GGM 022 <NA>
3 GGM 022 <NA>
4 GGM 019 26/03/2017
5 GGM 019 <NA>
6 GGM 019 <NA>
7 GGM 013 18/03/2017
8 GGM 021 <NA>
9 GGM 021 <NA>
10 GGM 021 <NA>
11 GGM 009 25/12/2016
12 GGM 009 14/01/2019
13 GGM 009 20/01/2019
14 GGM 029 <NA>
15 GGM 031 09/05/2019
16 GGM 031 19/06/2019
Here is what I want to get to:
Pair.ID laydate experience
1 GGM 022 <NA> NA
2 GGM 022 <NA> NA
3 GGM 022 <NA> NA
4 GGM 019 26/03/2017 0
5 GGM 019 <NA> NA
6 GGM 019 <NA> NA
7 GGM 013 18/03/2017 0
8 GGM 021 <NA> NA
9 GGM 021 <NA> NA
10 GGM 021 <NA> NA
11 GGM 009 25/12/2016 0
12 GGM 009 14/01/2019 1
13 GGM 009 20/01/2019 2
14 GGM 029 <NA> NA
15 GGM 031 09/05/2019 0
16 GGM 031 19/06/2019 1
A few things: 1) I need to keep the rows with NA as they are where pairs have had the oppurtunity to breed but did not. 2) I would like to have the information added to the mother dataframe, rather than creating a summary dataframe. 3) I would like to use dplyr if possible
I have looked around and tried to wrangle these solutions to fit my purpose but could not get them to work as needed: Rolling Count of Events Over Time Series and Count events before a specific time for a series of items in R