So I have a data set that contains sites, years, and a measured variable (let's say, x). x is measured a number of times throughout the year, across many years, and at multiple sites. Here is an example of my data set (each x was collected at different dates, I've simply extracted the year out of the dates as I'm interested in annual means). Let's call the data set df:

>df

site  year   x
  a   2000  10
  a   2000  12
  a   2000  13
  b   2000  14
  b   2000  15
  b   2000  17
  c   2000   9
  c   2000  11
  c   2000  11
  a   2001  11
  a   2001  12
  a   2001  12
  b   2001  13
...

and it goes on for multiple years.

I want to extract the mean of x for each site and year. I wrote a for loop, but I'm having trouble with it. I'd like it to return a data frame with site, year, and the average of x, but it seems to return the mean of all values in df$x as the first result, and then NaN for every remaining result.

Here is my code:

temp <- NULL
mn.x <- NULL
a <- NULL
for (i in unique(df$site)) {
  for (j in unique(df$year)) {
    site <- i
    year <- j
    a <- data.frame(site, year)
    temp <- mean(na.omit(df$x[df$site==i && df$year==j]))
    site.year <- data.frame(a, temp)
    mn.x <- rbind(temp, site.year)
  }
}

Just to be clear...the result that returns when I type mn.x in R is

>mn.x
 [1] 10.4
 [1] NaN
 [1] NaN
 [1] NaN
 [1] NaN
...

where 10.4 is the mean of x for all values of df$x (aka mean(df$x))

What's wrong with my loop? Or, as this is an example data set, perhaps there is actually a problem with my dataset? Just to clarify...class(df$x) is "numeric"

Thanks for any thoughts,
Paul

  • Yet another aggregate question it seems: `aggregate(x ~ site + year, data=df, mean)` – thelatemail Nov 21 '13 at 23:40
  • C'mon, Paul. Really try to do some searching next time. There are so many get-mean-by-group questions in SO that it would be hard to miss at least one of them. And do learn the difference btwn "&&" and "&". – IRTFM Nov 22 '13 at 01:44
  • I guess getting your ass handed to you is also how one learns sometimes... – logicForPresident Nov 22 '13 at 03:30
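
To spell out the `&&` versus `&` point from the comments: `&` compares two vectors row by row, while `&&` expects a single TRUE/FALSE on each side. In older R versions `&&` silently used only the first element of each vector, which would explain one overall mean followed by NaN; as far as I know, R 4.3.0 and later raise an error instead. The loop also binds `temp` rather than the accumulated `mn.x`, so earlier rows are lost. Below is a minimal sketch of a corrected loop. The toy `df` only reproduces the rows shown in the question, so site c has no rows for 2001 and that combination comes back as NaN.

```r
# Toy data: only the rows visible in the question (an assumption, not the full data set)
df <- data.frame(
  site = c(rep(c("a", "b", "c"), each = 3), rep("a", 3), "b"),
  year = c(rep(2000, 9), rep(2001, 4)),
  x    = c(10, 12, 13, 14, 15, 17, 9, 11, 11, 11, 12, 12, 13)
)

mn.x <- NULL
for (i in unique(df$site)) {
  for (j in unique(df$year)) {
    # `&` is elementwise, so this selects the rows for site i in year j
    rows <- df$site == i & df$year == j
    site.year <- data.frame(site = i, year = j, x = mean(df$x[rows], na.rm = TRUE))
    # append to the accumulated result instead of overwriting it
    mn.x <- rbind(mn.x, site.year)
  }
}
mn.x
```

Growing a data frame with `rbind` inside a loop works for small data, but the grouped-summary functions mentioned in the comments and in the answer are usually shorter and faster.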

1 Answer

A popular way to do this is with ddply from the plyr package:

require(plyr)
ddply(df, .(site,year), summarize, xm=mean(x))
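
If plyr is not installed, base R can build the same kind of table. This is a minimal sketch using the `aggregate()` call suggested in the comments. The toy `df` only reproduces the rows shown in the question, and renaming the result column to `xm` is just a choice to match the ddply output above.

```r
# Toy data: only the rows visible in the question (an assumption, not the full data set)
df <- data.frame(
  site = c(rep(c("a", "b", "c"), each = 3), rep("a", 3), "b"),
  year = c(rep(2000, 9), rep(2001, 4)),
  x    = c(10, 12, 13, 14, 15, 17, 9, 11, 11, 11, 12, 12, 13)
)

# One row per site/year combination that actually occurs in df
res <- aggregate(x ~ site + year, data = df, FUN = mean)

# Rename the mean column so it matches the ddply result
names(res)[names(res) == "x"] <- "xm"
res
```

Unlike a loop over every site and year, `aggregate()` only returns combinations that have data, so no NaN rows appear for missing site/year pairs.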
ndr