I know we can use the which() function to subset a data frame according to a given condition. If I have a data frame (dat) with 5 columns and I need the mean of the 5th column over the rows whose 1st, 2nd, 3rd and 4th columns equal w, x, y and z, I would do this:

 myrows<-dat[which((dat[,1]==w)&(dat[,2]==x)&(dat[,3]==y)&(dat[,4]==z)),5]
 mean(myrows)

Now, if there are 50 such columns, I certainly cannot hard-code it like this:

 myrows<-dat[which((dat[,1]==w)&(dat[,2]==x)&...&(dat[,49]==zz)),50]
 mean(myrows)

Is there another way, for example using a for loop, something like this:

 for(i in 1:50)
 myrows<-dat[which(dat[,i]==x[i])]

I am a beginner in R, so could you please suggest the easiest way to do this?

  • Please provide a reproducible example and the expected output so it's easier for others to help you. Have a [look at this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more info – Sotos Jun 16 '16 at 08:52
  • There is no reason for using `which` here. R supports logical subsetting. – Roland Jun 16 '16 at 09:03
  • ["When first learning subsetting, a common mistake is to use `x[which(y)]` instead of `x[y]`"](http://adv-r.had.co.nz/Subsetting.html) – RHertel Jun 16 '16 at 09:10
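
To illustrate the two comments above, here is a minimal sketch of logical subsetting without `which()`. The toy `dat` and its column names are made up for illustration and are not from the question. One difference worth knowing: if the condition contains `NA`, `x[y]` returns `NA` entries where `which()` would silently drop them.

```r
# Hypothetical toy data; names and values are assumptions for illustration
dat <- data.frame(a = c(1, 1, 2), b = c("p", "p", "q"), v = c(10, 20, 30))

# Subsetting through which(): row positions where the condition is TRUE
with_which <- dat[which(dat[, 1] == 1 & dat[, 2] == "p"), 3]

# Subsetting with the logical vector directly
logical_only <- dat[dat[, 1] == 1 & dat[, 2] == "p", 3]

# Both select the same values when the condition has no NA
identical(with_which, logical_only)
mean(logical_only)
```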

1 Answer

You can set up a reference data.frame with one row that holds the target values in the same order as the first four columns of your data frame (called df here), and then check row-wise whether all values match:

reference <- data.frame(w, x, y, z)
mean(df[apply(df[,-5],1,function(x)all(x==reference)),5])
Miff
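
For the 50-column case in the question, one possible alternative is to compare each key column with its target value and combine the results with `&`. This is only a sketch under assumptions: the toy `dat`, the `target` list, and the names `key_cols`, `value_col`, `matches` and `keep` are hypothetical and not from the original post.

```r
# Hypothetical toy data: 4 key columns plus a value column
dat <- data.frame(
  k1 = c(1, 1, 2, 1),
  k2 = c("a", "a", "b", "a"),
  k3 = c(TRUE, TRUE, FALSE, TRUE),
  k4 = c(5, 5, 5, 6),
  value = c(10, 20, 30, 40)
)

# Target values, one per key column, in column order (a list allows mixed types)
target <- list(1, "a", TRUE, 5)

key_cols <- seq_along(target)  # columns 1..4 here; would be 1..49 in the question
value_col <- ncol(dat)         # the last column holds the values to average

# Compare each key column with its target, giving one logical vector per column
matches <- Map(function(col, val) col == val, dat[key_cols], target)

# AND the logical vectors together: TRUE only where every key column matches
keep <- Reduce(`&`, matches)

result <- mean(dat[keep, value_col])
result
```

Working column by column keeps each column's own type, whereas `apply()` on a data frame first converts it to a matrix, which can turn mixed-type columns into character strings and lead to unexpected mismatches.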