Question:

How can I apply a function to subsets of a data frame in a vectorized manner?

Example:

For the data frame below:

x=c(1,2,1,2,1,2)
y=c(3,4,5,4,3,2)
df=data.frame(x,y)

I would like to apply a function (e.g. min()) to the y values belonging to each value of x, and collect the results in a vector.

Basically, I would like to have a vectorized version of this:

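nb <- max(df$x)        # assumes the x values are the integers 1..nb
V <- rep(0.0, nb)      # one slot per group
for (i in 1:nb) {
    v <- df$y[df$x == i]   # y values of the rows where x equals i
    V[i] <- min(v)
}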

# basically here:
# V[1] = min( df$y for x=1)
# V[2] = min( df$y for x=2)
Theo

1 Answer

The function tapply is designed for such problems:

with(df, tapply(y, x, FUN = min))
# 1 2
# 3 2
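
Note that tapply returns a named array (the names are the distinct values of x), not a bare vector. A minimal sketch of getting a plain vector comparable to V from the question's loop, assuming the same df; the variable name res is only illustrative:

```r
# Same data as in the question
df <- data.frame(x = c(1, 2, 1, 2, 1, 2),
                 y = c(3, 4, 5, 4, 3, 2))

# tapply returns a named array with one element per distinct value of x
res <- with(df, tapply(y, x, FUN = min))

# as.vector() drops the names and dimensions, leaving a plain numeric vector
V <- as.vector(res)
V
# [1] 3 2
```

The elements are ordered by the sorted distinct values of x, which matches the loop only when x takes the values 1..nb with no gaps.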

If you want to add the results to your data frame, you can use the function ave:

df$group.min <- with(df, ave(y, x, FUN = min))
df
#   x y group.min
# 1 1 3         3
# 2 2 4         2
# 3 1 5         3
# 4 2 4         2
# 5 1 3         3
# 6 2 2         2
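
If a summary with one row per group is more convenient than a per-row column, base R's aggregate is another option. This is only a sketch on the question's data, not part of the approach above; the name agg is illustrative:

```r
# Same data as in the question
df <- data.frame(x = c(1, 2, 1, 2, 1, 2),
                 y = c(3, 4, 5, 4, 3, 2))

# Formula interface: summarise y within each value of x
agg <- aggregate(y ~ x, data = df, FUN = min)
agg
#   x y
# 1 1 3
# 2 2 2
```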
Blue Magister