
I'm trying to delete every row in a data frame whose value in a given column is below that column's average. For some reason my code seems to pick and choose which rows it deletes. All help is appreciated, thank you. Here is my code:

k<-c(HW2$AGE)
j<-mean(k)
for (i in HW2$AGE)
  if (j>i){
    HW2 <- HW2[-i, ]
  }
zx8754
  • `HW2[HW2$AGE >= mean(HW2$AGE), ]` or similar will give you the result you want, I think. Generally, selections or row removals shouldn't need looping code, as R has vectorised functions built in. – thelatemail Feb 08 '16 at 00:35
  • THANK YOUUU!!! Thank you for the help, I really appreciate it, especially since the only coding experience I have is barely any Java and I'm still trying to learn R functions. – MCjuberfish Feb 08 '16 at 00:46
  • The `i` in your loop contains the value of `HW2$AGE`, not the index, so that won't do what you want. You're also overwriting the entire data frame on every iteration. I strongly recommend a beginner R course, such as one on Coursera. – Jonathan Carroll Feb 08 '16 at 00:47
  • Just remember that if you start writing `for (i in ...)` over any vector or rows of a matrix/data.frame, stop and reconsider if there is a function or extraction method that will do it all in one go. – thelatemail Feb 08 '16 at 00:48
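
To illustrate the point made in the comments about `i` holding values rather than row indices, here is a minimal sketch. The contents of `HW2` below are made up for illustration (the asker's real data is not shown), and the `ID` column is a hypothetical addition so that the result stays a data frame.

```r
# Hypothetical stand-in for the asker's data frame.
HW2 <- data.frame(ID = 1:5, AGE = c(2, 5, 1, 4, 3))

# Looping over the column yields the ages themselves, not row positions.
for (i in HW2$AGE) print(i)

avg <- mean(HW2$AGE)

# HW2[-i, ] therefore drops row number `i`. With i = 2 (the first age),
# this removes row 2, whose AGE is 5, not the row the age came from.
HW2[-2, ]

# Vectorised alternative: keep only rows whose AGE is not below the mean.
kept <- HW2[HW2$AGE >= avg, ]
kept$AGE
```

Because the logical condition is evaluated for the whole column at once, no row positions shift while rows are being removed, which is what makes the loop appear to "pick and choose".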

1 Answer


You don't need a loop. Instead, I would use the approach below.

Sample data

x <- data.frame("A"= runif(10), "B" = runif(10))

Calculate mean

xMean <- mean(x[,"A"])

Exclude rows below the mean

y <- x[x$A >= xMean, ]

This is probably the most obvious way of excluding unwanted rows.
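
Applied to a data frame shaped like the one in the question, the same idea might look like the sketch below. The `HW2` values and the `NAME` column are invented for illustration, and treating missing ages with `na.rm = TRUE` and `which()` is an assumption about how `NA` values should be handled, not something the question specifies.

```r
# Hypothetical data; the asker's real HW2 is not shown in the question.
HW2 <- data.frame(NAME = c("a", "b", "c", "d"),
                  AGE  = c(20, 30, NA, 40))

# Ignore missing ages when computing the mean (assumption).
age_mean <- mean(HW2$AGE, na.rm = TRUE)

# Keep rows whose AGE is at least the mean, i.e. delete rows below it.
# which() discards NA comparisons, so no all-NA rows are returned.
HW2_kept <- HW2[which(HW2$AGE >= age_mean), ]

# subset() gives the same rows and also drops NA comparisons.
HW2_kept2 <- subset(HW2, AGE >= age_mean)
```

Without `which()`, a row where the comparison is `NA` would come back as a row filled with `NA`, which is a common surprise with logical indexing.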

thelatemail
Celeste