I would like to run a "for" loop that uses three indices. Basically, I want to subset a data frame, find the mean of the subset, and place the mean value in a new data frame. I am having trouble running this loop; all I get is NaNs.
The first index matches the rows of the new data frame (which I call data.avg). The second index picks from a vector used in the first half of the subsetting condition (that the date values be from a specific month). The third index does the same for the second half of the subsetting condition (that the row is associated with Breakfast/Dinner/Snacks).
# Create the data frame
data1 = data.frame(date = sort(rep(as.Date(42948:43101, origin = "1899-12-30"), 3)),
                   serving = rep(c("Breakfast", "Dinner", "Snacks"), 154),
                   units = rep(c(1, 5, 49), 154)
)
View(data1[order(data1$date),])
# take mean of each subset and place it in a new data frame called data.avg
# it should be a 6x3 block of means; rows (column 1) are "August", "September", "October", "November", "December", "January".
# columns should be "Breakfast", "Dinner", "Snacks"
month.index = c(8:12, 1)
serving.index = c("Breakfast", "Dinner", "Snacks")
# create the data frame with the means using placeholder data
data.avg = data.frame(months = c(month.name[8:12], month.name[1]),
                      bf.avg = c(1:6),
                      dinner.avg = c(1:6),
                      snack.avg = c(1:6))
# now start replacing; find the mean of the subset of the original data frame.
# find the mean of all dates that are for August, and whose serving type are for Breakfast.
for(j in 1:6){
  for(i in month.index){
    for(v in 2:4){
      data.avg[j,v] = mean(
        subset(data1,
               months(data1$date) == month.name[i] & data1$serving == serving.index[v])$units
      )
    }
  }
}
When I run the mean without the loop, for example:
mean(subset(data1,
            months(data1$date) == "September" & data1$serving == "Breakfast")$units)
I get the correct mean. Because of this, I am thinking that my issue may lie in the index setup.
Any and all help would be greatly appreciated,
Thanks
Edit: fixed the above code. The resulting data frame is the following:
     months bf.avg dinner.avg snack.avg
1    August      5         49       NaN
2 September      5         49       NaN
3   October      5         49       NaN
4  November      5         49       NaN
5  December      5         49       NaN
6   January      5         49       NaN
Here is what I am looking for:
> mean(subset(data1,
+ months(data1$date) == "September" & data1$serving == "Breakfast")$unit)
[1] 1
> mean(subset(data1,
+ months(data1$date) == "September" & data1$serving == "Dinner")$unit)
[1] 5
> mean(subset(data1,
+ months(data1$date) == "September" & data1$serving == "Snacks")$unit)
[1] 49
My understanding is that these should be the values in data.avg[2, 2:4] (the September row).
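For what it is worth, one possible explanation of the shifted and NaN columns is the way v is used: it runs over 2:4 and serves both as the column of data.avg and as the position in serving.index, so column 2 receives the "Dinner" mean, column 3 the "Snacks" mean, and column 4 asks for serving.index[4], which is NA. The i loop is also independent of j, so every row ends up holding the values for the last month visited. The sketch below is only one way to pair the indices, not a definitive fix. It assumes the serving labels are spelled exactly as in data1, and it swaps months() for a numeric month taken from format(date, "%m") so that the comparison does not depend on the locale; the helper names month.num and keep are made up for illustration.

```r
# Rebuild the example data from the question.
data1 <- data.frame(
  date    = sort(rep(as.Date(42948:43101, origin = "1899-12-30"), 3)),
  serving = rep(c("Breakfast", "Dinner", "Snacks"), 154),
  units   = rep(c(1, 5, 49), 154)
)

month.index   <- c(8:12, 1)
serving.index <- c("Breakfast", "Dinner", "Snacks")  # must match data1$serving exactly

# Placeholder result: one row per month, one column per serving type.
data.avg <- data.frame(months     = month.name[month.index],
                       bf.avg     = NA_real_,
                       dinner.avg = NA_real_,
                       snack.avg  = NA_real_)

# Numeric month of each row (hypothetical helper, locale-independent).
month.num <- as.integer(format(data1$date, "%m"))

# Two loops are enough: j picks the row/month, v picks the serving type.
for (j in seq_along(month.index)) {
  for (v in seq_along(serving.index)) {
    keep <- month.num == month.index[j] & data1$serving == serving.index[v]
    # Column v + 1 because column 1 of data.avg holds the month names.
    data.avg[j, v + 1] <- mean(data1$units[keep])
  }
}

print(data.avg)
```

With the example data, every Breakfast row has units 1, every Dinner row 5, and every Snacks row 49, so each month's row in data.avg should come out as 1, 5, 49.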