I'm trying to use the get function in R to reference and return a column in a data frame.

Taking this example data frame:

x <- data.frame(id= c("a", "b", "c"), term= c(179, 182, 179), col1= c(1, 2, 3), col2 = c(4, 5, 6))

Now, let's say I put the two column references into a vector:

vars <- c("x$col1", "x$col2")

Then when I call get on vars, I want it to return the appropriate values, e.g. get(vars[2]) should ideally return x$col2.

However, I get the following error when I try running get(vars[2]):

> get(vars[2])
Error in get(vars[2]) : object 'x$col2' not found

But when I just run x$col2 there is no issue and I get the expected result:

> x$col2
[1] 4 5 6

So clearly the object x$col2 exists.

What am I doing wrong here?

beri

1 Answer

This happens because get() expects a variable name, and x$col2 is not a variable name in R. x is a variable, $ is a function, and col2 is an argument to that function. It is basically like asking for get("mean(1:3)"), which doesn't make sense because that string is an expression, not the name of a variable. So the error message is right: x$col2 is not an object, but x is an object that has a named element col2. Rather than retrieving a variable, you need to execute the command that you've stored in a string.
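As a minimal sketch of that distinction, using the x from the question (the names df and via_dollar are only illustrative): get() resolves a single name, while x$col2 is really a call to the `$` function.

```r
x <- data.frame(id = c("a", "b", "c"), term = c(179, 182, 179),
                col1 = c(1, 2, 3), col2 = c(4, 5, 6))

# get() looks up one variable name, so this returns the whole data frame
df <- get("x")

# x$col2 is shorthand for calling the `$` function with x and the element name
via_dollar <- `$`(x, "col2")

# exists() does the same kind of name lookup as get():
# there is a variable called x, but none literally called "x$col2"
has_x <- exists("x")
has_x_col2 <- exists("x$col2")
```

Here has_x is TRUE and has_x_col2 is FALSE, which is the same lookup failure that get("x$col2") reports as an error.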

You have a few options. If you want to execute a string as code, you can do

eval(parse(text="x$col1"))

Though this is generally not recommended, because those strings could contain arbitrary (and potentially dangerous) code, and all of it would be executed.
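To illustrate the risk, here is a small sketch in which the string named untrusted is a made-up, harmless stand-in for input you don't control. eval(parse()) runs whatever the string contains, not only column lookups.

```r
x <- data.frame(col1 = c(1, 2, 3), col2 = c(4, 5, 6))

# parse() turns the string into an unevaluated expression; eval() runs it
result <- eval(parse(text = "x$col1"))

# A string can just as easily contain code with side effects.
# This hypothetical input silently deletes a column from x.
untrusted <- "x$col1 <- NULL"
eval(parse(text = untrusted))

remaining <- names(x)
```

After this runs, result holds the original col1 values, but x itself has lost that column: remaining contains only "col2".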

You could just store the column names

vars <- c("col1", "col2")
x[[vars[2]]]

or you can use get() for the data.frame and the strings for the columns

mydata <- "x"
vars <- c("col1", "col2")
get(mydata)[[vars[2]]]
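Storing only the column names also makes it straightforward to work with several columns at once. A short sketch, assuming the same x as in the question (second, subset_df and col_sums are illustrative names):

```r
x <- data.frame(id = c("a", "b", "c"), term = c(179, 182, 179),
                col1 = c(1, 2, 3), col2 = c(4, 5, 6))
vars <- c("col1", "col2")

# [[ ]] with a single name returns that column as a vector
second <- x[[vars[2]]]

# [ ] with a character vector returns a data frame with just those columns
subset_df <- x[vars]

# Loop over the names without get() or eval(parse())
col_sums <- sapply(vars, function(v) sum(x[[v]]))
```

Because the names are plain strings, they can be built or filtered programmatically and still be used for indexing.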

But it might be even better to take a step back and see how you got to this point in the first place. This isn't the type of thing you often need to do when using R in an R-like way. However, you haven't provided much context about what you are really trying to accomplish, so it's not easy to suggest an alternative strategy.

MrFlick
  • This is certainly the way to do it if you *must* do indirect variable access and indexing ... @beri, is there a reason you have to do it this way? Whenever I see `get` (along with `attach` and `assign`), I cringe at the inefficiency and scalability issues with the methodology. – r2evans Jun 06 '19 at 19:34
  • Thank you so so much for your detailed response! I really appreciate it! I'm new to StackOverflow so I will add some details about what worked and what I'm trying to do as a new "answer" below -- sorry if this is not the appropriate way to do it! – beri Jun 07 '19 at 14:11
  • I'll try to summarize the context: I'm trying to create a var that takes its value from one of the other vars in the df, based on the value in a var called "term" (I updated the df in the original example to include it). As an experienced SAS/SAS macro programmer, I know that what I want can be done very easily with arrays and loops. What I ended up doing was: `xTerms <- c(179, 182); xVars <- c("col1", "col2"); x$startVal <- NA; for(i in 1:length(xTerms)){ x$startVal <- ifelse(x$term == xTerms[i], x[[xVars[i]]], x$startVal) }` Is there a better way of creating flexible code in R that uses R more efficiently? – beri Jun 07 '19 at 15:06
  • I really can't tell what you are trying to do with that example. It would be better to open up a new question with a more complete [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output. Often R code will look very different from SAS code. – MrFlick Jun 07 '19 at 15:29
  • Yeah I had a bunch more info but it got deleted. I opened a new question: https://stackoverflow.com/questions/56497743/is-it-inefficient-to-write-r-code-that-indirectly-references-variables-to-bypass – beri Jun 07 '19 at 15:59
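
For reference, the loop from beri's comment could plausibly be written without a loop. The following is only a sketch, not something proposed in the thread: it assumes each term value maps to exactly one column, and it uses match() plus matrix indexing (idx and vals are illustrative names).

```r
x <- data.frame(id = c("a", "b", "c"), term = c(179, 182, 179),
                col1 = c(1, 2, 3), col2 = c(4, 5, 6))
xTerms <- c(179, 182)
xVars  <- c("col1", "col2")

# For each row, find which entry of xTerms its term matches (NA if none)
idx <- match(x$term, xTerms)

# Matrix indexing: take the value at (row i, column idx[i]) of the candidate columns
vals <- as.matrix(x[xVars])
x$startVal <- vals[cbind(seq_len(nrow(x)), idx)]
```

For the example data this gives the same startVal as the loop version, and rows whose term is not in xTerms would end up as NA.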