
I'm trying to run the cor function as a first step toward PCA. The data frame clearly has the column whose name I'm trying to exclude from the correlation, yet I get an error message saying the object is not found.

Error in `[.data.frame`(ABCD, , -xyz) : object 'xyz' not found

In the above example 'xyz' is the column name. What should I be doing differently?

I'm trying to learn using the heptathlon data set, which is available in the "HSAUR" package.

> head(heptathlon)
                    hurdles highjump  shot run200m longjump javelin run800m score
Joyner-Kersee (USA)   12.69     1.86 15.80   22.56     7.27   45.66  128.51  7291

The column "score" is the eighth column and I get the error when I run:

> round(cor(heptathlon[,-score]), 2)
Error in `[.data.frame`(heptathlon, , -score) : object 'score' not found

If I replace the column name with the column number, it works. However, I can't use that approach for data sets with many columns.

vkkumar
  • Welcome to SO. Please provide some reproducible data. Can you show some rows of your data, and the code that resulted in that error? – Ananta Feb 13 '14 at 04:08
  • What commands are you issuing exactly?? – Apprentice Queue Feb 13 '14 at 04:30
  • I have added the commands to the text above. – vkkumar Feb 13 '14 at 04:34
  • Are you trying to get `cor` for all columns except "score"? You cannot do it that way. Try `cor(heptathlon[, -8])` or `cor(subset(heptathlon, select = -score))` or `cor(heptathlon[, !names(heptathlon) %in% "score"])` – Ananta Feb 13 '14 at 04:40
  • @BlueMagister - please explain how you believe that question is a duplicate of this one. – Chris Stratton Feb 27 '14 at 21:47
  • @ChrisStratton OP is attempting to remove a column from a data frame. The methods to do so are at the question linked. Also consider: http://stackoverflow.com/questions/11565055/remove-a-column-from-a-data-frame-by-name – Blue Magister Feb 27 '14 at 22:02

1 Answer


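You can't remove a column by name with a - sign, like you can with numerical indices. Inside [ ], an unquoted name such as score is evaluated as an ordinary variable, not as a column name, which is why R reports that the object is not found. Quoting it does not help either, because the unary - cannot be applied to a character string.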

But you can easily remove a column by name by using logical indexing. Here's an example, removing the column Sepal.Width from iris:

head(iris, 2)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa

i <- iris[,names(iris) != 'Sepal.Width']
head(i, 2)
  Sepal.Length Petal.Length Petal.Width Species
1          5.1          1.4         0.2  setosa
2          4.9          1.4         0.2  setosa

Note that - is not used, and the column name is quoted.
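
Applied to the correlation in the question, the same idea might look like the sketch below. It uses mtcars (which ships with R and has only numeric columns) as a stand-in for heptathlon so that it runs without HSAUR installed; the variable names drop_col, keep, m1, m2 and m3 are made up for illustration.

```r
# mtcars stands in for heptathlon here: all of its columns are numeric.
drop_col <- "mpg"  # illustrative choice of the column to exclude

# Logical indexing: TRUE for every column that should be kept.
keep <- names(mtcars) != drop_col
m1 <- round(cor(mtcars[, keep]), 2)

# Two alternatives that should give the same result:
# setdiff() on the column names, or subset() with an unquoted name.
m2 <- round(cor(mtcars[, setdiff(names(mtcars), drop_col)]), 2)
m3 <- round(cor(subset(mtcars, select = -mpg)), 2)

# With HSAUR loaded, the question's call would follow the same pattern:
# round(cor(heptathlon[, names(heptathlon) != "score"]), 2)
```

The - works inside subset() only because subset() evaluates select specially, treating column names as positions. For code that takes the column name from a variable, the logical-index or setdiff() forms are the safer choice.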

Matthew Lundberg