I am trying to calculate the mean of the n values in one vector whose corresponding values in another vector fall within a window around x, separately within each level of a categorical factor.

I have attached a data frame with sample data and expected results.

Basically I am looking at fish catch data and acoustic estimates from a number of lakes. The catch and acoustic (g.ha) data are stratified by depth, as nets were set at different depths (some depths are missing, some are repeated). I want to increase the size of the original depth strata by pooling catch data from the adjacent depth strata (±2 m).

The expected results (mean.catch, mean.g.ha) were calculated manually: for each row, mean.catch and mean.g.ha are the means of all catch and g.ha values from the same lake whose depth lies within ±2 m of that row's depth, so n varies from row to row.

lake <- c("a","a","a","a","a","a","a","a", "b","b","b","b","b", "b","b","b","b","b", "b","b", "b","b","b","b")

net.id <- c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)

catch <- c(0:23)

g.ha <- c(1:24)

depth <- c(0, 1, 3, 4, 6, 7, 9, 10, 0, 1, 3, 4, 11, 13, 14, 16, 11, 12, 14, 15, 20, 22, 23, 25)

mean.catch <- c(0.5, 1, 2, 3, 4, 5, 6, 6.5, 8.5, 9, 10, 10.5, 14.5, 15.57142857, 16, 16.5, 14.5, 15, 16, 15.8, 20.5, 21, 22, 22.5)

mean.g.ha <- c(1.5, 2, 3, 4, 5, 6, 7, 7.5, 9.5, 10, 11, 11.5, 15.5, 16.57142857, 17, 17.5, 15.5, 16, 17, 16.8, 21.5, 22, 23, 23.5)

df <- data.frame(lake, net.id, depth, catch, g.ha, mean.catch, mean.g.ha)

The answer to "R - Faster Way to Calculate Rolling Statistics Over a Variable Interval" works, but I have to create a subset for each lake. Is it possible to apply it to each lake separately in one go, rather than repeating code and creating a lot of subsets?

a <- subset(df, lake == "a")
as <- a[ ,c(1, 3)]
as
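# Mean of y over all observations whose x lies within +/- width of each xout
rollmean_r = function(x,y,xout,width) {
  out = numeric(length(xout))
  for( i in seq_along(xout) ) {
    # TRUE for every observation inside the window centred on xout[i]
    window = x >= (xout[i]-width) & x <= (xout[i]+width)
    # mean() instead of .Internal(mean()), which is not meant for user code
    out[i] = mean( y[window] )
  }
  return(out)
}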

x = a$depth
y = a$catch
As <- rollmean_r(x,y,xout=x,width=2)
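
For reference, this is a minimal sketch of the per-lake behaviour I am after, using only base R. It is not taken from the linked answer: the vapply-based rewrite of rollmean_r and the output column names roll.catch and roll.g.ha are my own assumptions for illustration. It splits the row indices by lake, applies the window function within each group, and writes the results back in the original row order.

```r
# Same sample data as above (expected-result columns omitted)
lake  <- rep(c("a", "b"), times = c(8, 16))
depth <- c(0, 1, 3, 4, 6, 7, 9, 10, 0, 1, 3, 4, 11, 13, 14, 16,
           11, 12, 14, 15, 20, 22, 23, 25)
catch <- 0:23
g.ha  <- 1:24
df <- data.frame(lake, depth, catch, g.ha)

# Mean of y over all observations whose x lies within +/- width of each xout
rollmean_r <- function(x, y, xout, width) {
  vapply(xout, function(x0) mean(y[abs(x - x0) <= width]), numeric(1))
}

# Row indices grouped by lake, so each window only sees rows from one lake
idx <- split(seq_len(nrow(df)), df$lake)

# Hypothetical output columns, filled group by group in original row order
df$roll.catch <- NA_real_
df$roll.g.ha  <- NA_real_
for (i in idx) {
  df$roll.catch[i] <- rollmean_r(df$depth[i], df$catch[i], df$depth[i], width = 2)
  df$roll.g.ha[i]  <- rollmean_r(df$depth[i], df$g.ha[i],  df$depth[i], width = 2)
}
```

Because the indices come from split(), no per-lake subsets have to be created by hand, and adding more lakes does not require any extra code.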