I have a large data frame with 2107377 observations of 46 variables. I have a function that subsets this data frame based on the day of the year:
subset.function = function(dataset,year,focal.year,day.of.year) {
  subset(dataset, year == focal.year & day.of.year <= ifelse(leap_year(focal.year), 112, 111))
}
The data were collected from 2004 to 2014. I want to create 11 data frames from this data frame, one per focal year (2004, 2005, 2006, etc.), each containing all of the rows from the first 111 days of that year (or the first 112 days in a leap year).
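To make the target concrete, here is a minimal sketch on toy data rather than my real data frame. The column names `year` and `day.of.year` are assumed to match the real ones; `toy`, `value`, `cutoff`, and the helper `is_leap` are hypothetical names made up for this illustration (`is_leap` is a base-R stand-in for `leap_year` so the snippet runs without extra packages).

```r
# Base-R stand-in for leap_year(); hypothetical helper for this example only.
is_leap <- function(y) (y %% 4 == 0 & y %% 100 != 0) | (y %% 400 == 0)

# Toy data: one row per calendar day for each year from 2004 to 2014.
years <- 2004:2014
toy <- do.call(rbind, lapply(years, function(y) {
  n_days <- if (is_leap(y)) 366 else 365
  data.frame(year = y, day.of.year = seq_len(n_days), value = rnorm(n_days))
}))

# Cutoff day for each focal year: 112 in leap years, 111 otherwise.
cutoff <- ifelse(is_leap(years), 112, 111)
names(cutoff) <- years

# The result I want for two focal years, written out by hand.
want_2004 <- toy[toy$year == 2004 & toy$day.of.year <= cutoff[["2004"]], ]
want_2005 <- toy[toy$year == 2005 & toy$day.of.year <= cutoff[["2005"]], ]

nrow(want_2004)  # 112 (2004 is a leap year)
nrow(want_2005)  # 111
```

On the real data I would expect the same pattern: one data frame per year, each with far fewer rows than the original.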
I could do this by applying my subset function 11 times, each time storing it in a new variable:
variable1 = subset.function(dataset, year, 2004, day.of.year)
variable2 = subset.function(dataset, year, 2005, day.of.year)
...
variable11 = subset.function(dataset, year, 2014, day.of.year),
but that's not very fun. I've tried to use a for loop to do this with fewer lines, but it doesn't work:
test = vector("list")
for (i in 1:years) {
  test[[i]] = subset.function(dataset, year, focal.year[i], day.of.year)
}
This creates a large list with the same number of elements in each item of the list as the original data frame. I've also tried using the apply family of functions:
apply(dataset, year, focal.year[1:11], day.of.year)
with equally disappointing results. What am I missing?