I am trying to figure out how to run two different loops on the same code. I am trying to create a matrix where I am filling a column with the mean of a variable for each year.

Here's the code I am using to do it right now:

matplot2 = as.data.frame(matrix(NA, nrow=16, ncol=4))

matplot2[1,1] = mean(matplot[matplot$Year==2003, 'TotalTime'])
matplot2[2,1] = mean(matplot[matplot$Year==2004, 'TotalTime'])
matplot2[3,1] = mean(matplot[matplot$Year==2005, 'TotalTime'])
matplot2[4,1] = mean(matplot[matplot$Year==2006, 'TotalTime'])
matplot2[5,1] = mean(matplot[matplot$Year==2007, 'TotalTime'])
matplot2[6,1] = mean(matplot[matplot$Year==2008, 'TotalTime'])
matplot2[7,1] = mean(matplot[matplot$Year==2009, 'TotalTime'])
matplot2[8,1] = mean(matplot[matplot$Year==2010, 'TotalTime'])
matplot2[9,1] = mean(matplot[matplot$Year==2011, 'TotalTime'])
matplot2[10,1] = mean(matplot[matplot$Year==2012, 'TotalTime'])
matplot2[11,1] = mean(matplot[matplot$Year==2013, 'TotalTime'])
matplot2[12,1] = mean(matplot[matplot$Year==2014, 'TotalTime'])
matplot2[13,1] = mean(matplot[matplot$Year==2015, 'TotalTime'])
matplot2[14,1] = mean(matplot[matplot$Year==2016, 'TotalTime'])
matplot2[15,1] = mean(matplot[matplot$Year==2017, 'TotalTime'])
matplot2[16,1] = mean(matplot[matplot$Year==2018, 'TotalTime'])

If it were just the year changing, I would write the loop like this:

for(i in 2003:2018) {
     matplot2[1,1] = mean(matplot[matplot$Year==i, 'TotalTime'])
}

But, I need the row number in the matrix I'm printing the results into to change as well. How can I write a loop where I am printing the results of all these means into one column of a matrix?

In other words, I need to be able to have it loop matplot2[j,1] in addition to the matplot$Year==i.

Any suggestions would be greatly appreciated!

James

2 Answers

Your literal calculations of the mean(TotalTime) can all be reduced to a single command (with no for loop required):

matplot2 <- aggregate(TotalTime ~ Year, data = matplot, FUN = mean)

That should return a two-column frame with the unique values of Year in the first column, and the respective means in the second column.

Demonstrated with data I have:

head(mtcars)
#                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
res <- aggregate(disp ~ cyl, data = mtcars, FUN = mean)
res
#   cyl     disp
# 1   4 105.1364
# 2   6 183.3143
# 3   8 353.1000
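
Applied to a frame shaped like the one in the question, the same call might look like the sketch below. The data are made up for illustration; only the column names Year and TotalTime come from the question.

```r
# Hypothetical stand-in for the asker's `matplot` frame; the values are invented.
matplot <- data.frame(
  Year      = c(2003, 2003, 2004, 2004, 2005),
  TotalTime = c(10, 20, 30, 50, 7)
)

# One row per unique Year, holding the mean of TotalTime for that year.
matplot2 <- aggregate(TotalTime ~ Year, data = matplot, FUN = mean)
matplot2
#   Year TotalTime
# 1 2003        15
# 2 2004        40
# 3 2005         7
```

Because the years are taken from the data, there is no need to know in advance how many rows the result should have.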

This and more can be seen in the "summarize by group" question (of which this question is essentially a duplicate, even if you didn't know to ask it that way).

r2evans

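R is a vectorized language, so the row index and the years can both be handled as vectors. Note that mean() collapses its input to a single value, so it has to be called once per year (here with sapply) rather than once on the whole vector of years.

i <- 1:16
# one mean per year, assigned to rows 1..16 of the first column
matplot2[i, 1] <- sapply(2002 + i, function(y) mean(matplot[matplot$Year == y, 'TotalTime']))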
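
For the literal question (advancing the row number together with the year), one option is to loop over positions in a vector of years instead of over the years themselves. This is a minimal sketch on made-up data; the `years` vector and the sample values are assumptions for illustration, not part of the original post.

```r
# Hypothetical stand-in for the asker's `matplot` frame; the values are invented.
matplot <- data.frame(
  Year      = c(2003, 2003, 2004, 2004, 2005),
  TotalTime = c(10, 20, 30, 50, 7)
)

years <- 2003:2005  # would be 2003:2018 for the data in the question
matplot2 <- as.data.frame(matrix(NA, nrow = length(years), ncol = 4))

# j is the row to fill; years[j] is the year that belongs in that row.
for (j in seq_along(years)) {
  matplot2[j, 1] <- mean(matplot[matplot$Year == years[j], "TotalTime"])
}
```

Using seq_along(years) keeps the row index and the year in step without hard-coding an offset such as 2002.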
Dave2e
  • r2evans's response seems to be the best option for what I'm doing, but thanks so much for responding. There have been other times I've needed to do something like this, and this would have been so useful. – James Nov 10 '22 at 01:45
  • @James, I agree with your conclusion, you should accept his answer to close the question and you can earn a badge. – Dave2e Nov 10 '22 at 02:21