0

I have a data frame with many columns. I want to create a new data frame that contains only some of its rows, so I've used subset, which works great.

newDF<-subset(oldDF, col1==1)

To complicate things, I want one of the columns in the condition to be identified through a variable, such as X. For example, I want the new data frame to contain all rows of oldDF in which the values of colName2 are greater than zero:

X <- "colName2"
newDF<-subset(oldDF, X>0)

The problem is that when I run this using X, I get nothing.

When I run this using the specific column name (and not a variable):

newDF<-subset(oldDF, colName2>0)

I get the right results.

When I test the value in X using oldDF[,X], I get the right column.

What am I missing? What am I doing wrong?

Ronak Shah
Neta

2 Answers

1

You may try this way:

newDF <- oldDF[oldDF$colName2 > 0, ]
0

Since you have not provided data, I am using the mtcars dataset as an example. When you use:

subset(mtcars, cyl == 4)

R looks for a column named cyl in mtcars. Now when you do:

X <- "cyl" 
subset(mtcars, X == 4)

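Here R first looks for a column named X in mtcars. There is none, so it falls back to the variable X, whose value is the string "cyl". The condition becomes "cyl" == 4, which is FALSE, and hence you get an empty data frame.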
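To see this in isolation, the condition can be evaluated on its own. This is only a small sketch using the built-in mtcars data; the names cond and empty are made up for illustration.

```r
X <- "cyl"

# mtcars has no column called X, so the condition uses the variable X,
# which holds the string "cyl". Comparing it with 4 coerces 4 to "4",
# and "cyl" == "4" is a single FALSE.
cond <- X == 4
print(cond)

# A single FALSE selects no rows, so subset() returns zero rows.
empty <- subset(mtcars, X == 4)
print(nrow(empty))
```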

There are several ways to subset a data frame using a column name stored in a variable, but passing the bare variable to subset is not one of them. You'll also notice that mtcars$cyl works fine but mtcars$X does not (it returns NULL), because $ also looks for a column literally named X.

When you want to select a column using a variable, you can use (as you have already figured out) mtcars[, X] or mtcars[[X]].
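To filter rows by a condition on that column, the extracted column can be compared first and the resulting logical vector used as the row index. A minimal sketch on mtcars, where keep and four_cyl are illustrative names:

```r
X <- "cyl"

# mtcars[[X]] extracts the column named by X as a plain vector,
# so the comparison yields one TRUE/FALSE per row.
keep <- mtcars[[X]] == 4

# Use the logical vector as the row index; leaving the column index
# empty keeps all columns.
four_cyl <- mtcars[keep, ]

print(nrow(four_cyl))
print(unique(four_cyl[[X]]))
```

Note that, unlike subset, bracket indexing does not drop rows where the condition is NA; if the column may contain missing values, mtcars[which(keep), ] is one way to exclude them.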

Ronak Shah
  • Thanks for the explanation! this is what I thought. Though I can't figure out how to use it to subset with a condition? mtcars[ , X>0] doesn't work. Thanks! – Neta Oct 16 '20 at 11:09
  • @Neta Use `mtcars[mtcars[[X]] > 0, ]` – Ronak Shah Oct 16 '20 at 11:44