I have a dataframe with counts of geese at several different sites. The aim was to count geese once a month, in each of the 8 months from September to April, at each site in consecutive winter periods. A winter period is defined as the 8 months from September to April.
If the method had been carried out as planned, this is what the data would look like:
library(lubridate)  # dmy() comes from the lubridate package
df <- data.frame(site=c(rep('site 1', 16), rep('site 2', 16), rep('site 3', 16)),
                 date=dmy(rep(c('01/09/2007', '02/10/2007', '02/11/2007',
                                '02/12/2007', '02/01/2008', '02/02/2008', '02/03/2008',
                                '02/04/2008', '01/09/2008', '02/10/2008', '02/11/2008',
                                '02/12/2008', '02/01/2009', '02/02/2009', '02/03/2009',
                                '02/04/2009'), 3)),
                 count=sample(1:100, 48))
I've ended up in a situation where some sites have all 8 counts in some September-April periods, but not in others. In addition, some sites never achieved 8 counts in any September-April period. These toy data look like my actual data:
df <- df[-c(11:16, 36:48),]
I need to remove the rows that do not form part of a complete set of 8 consecutive monthly counts within a single September-April period. Using the toy data, this is the dataframe I need:
df <- df[-c(9:10, 27:29), ]
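To make the rule concrete, here is a minimal base-R sketch of the filtering described above, offered as an illustration on the toy data rather than a tested solution for the real data. The `toy` data frame, the `winter` column, and the `n_months` vector are hypothetical names introduced only for this sketch, and it assumes every row falls in a September-April month with at most one count per site per month.

```r
# Toy data rebuilt in base R so the sketch runs on its own (counts are arbitrary).
dates <- as.Date(c('2007-09-01', '2007-10-02', '2007-11-02', '2007-12-02',
                   '2008-01-02', '2008-02-02', '2008-03-02', '2008-04-02',
                   '2008-09-01', '2008-10-02', '2008-11-02', '2008-12-02',
                   '2009-01-02', '2009-02-02', '2009-03-02', '2009-04-02'))
toy <- data.frame(site  = rep(c('site 1', 'site 2', 'site 3'), each = 16),
                  date  = rep(dates, 3),
                  count = 1:48)
toy <- toy[-c(11:16, 36:48), ]  # same incomplete pattern as in the question

# Hypothetical helper column: the year in which each row's winter started.
# September-December belong to that year's winter, January-April to the previous year's.
yr  <- as.integer(format(toy$date, '%Y'))
mon <- as.integer(format(toy$date, '%m'))
toy$winter <- ifelse(mon >= 9, yr, yr - 1L)

# Number of distinct months counted per site and winter, aligned with the rows of toy.
n_months <- ave(mon, toy$site, toy$winter, FUN = function(m) length(unique(m)))

# Keep only the rows that belong to a site/winter with all 8 months present.
complete <- toy[n_months == 8, ]
table(complete$site, complete$winter)
```

On the toy data this keeps the first winter of site 1 and both winters of site 2, which is the same set of rows as the target dataframe. If the real data can contain counts from May-August, those rows would need to be dropped before labelling the winters.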
I've tried various commands using ddply() from the plyr package, but without success. Is there a solution to this problem?