In R, are the following sets of expressions functionally similar to one another?

  • assign(x,y), and eval(parse(text = "x")) and x=as.name("string")
  • get(x), and deparse(substitute(x)) and ____ (x=as.string(object))?
  • do.call("<-",list(x, y))

For example, can each of them be used in the following batch process?

## example data tables (requires the data.table package; mutate() below comes from dplyr)
library(data.table)
library(dplyr)
(dt1 <- data.table(urlteststatus = letters[1:24], urltest = letters[1:24], istest = letters[23:24] ) )
(dt2 <- data.table(istest = letters[2:2], urlteststatus = letters[1:10], urltest = paste0(letters[3:3], letters[1:10]) ) )

# list of variables to iterate over
tlist <- list("dt1","dt2")

# Batch process: for each table name x, this creates a new variable "x.2" with an added column that concatenates the variables "urltest", "istest", and "urlteststatus".
for (x in tlist) {
  assign(paste0(x,".2"), mutate(assign(x, get(x)),
                            r_teststatus = paste(urltest,
                                                 istest,
                                                 urlteststatus)))
}

These examples are from: Convert string to a variable name, Adding data frames as list elements (using for loop), and How to convert string to a dataframe name pandas/python.

PS: Frequently in the response, I see comments like "that's not a good way to do it, a more R-like way would be to. . ." but they usually don't have a link to what the more R-like way would be. Please explain or link to what the "right" way is, and why.

Josh
  • You should start by running all your examples with a variety of types and sizes of inputs and observing the outputs and the runtimes. Reading *carefully* the man pages for every function is also a good idea. Quite often the choice depends on exactly what you want your complete code to do. – Carl Witthoft Jan 04 '21 at 18:30
  • Also show the expected output from your code and explain what the intention of the code is. – G. Grothendieck Jan 04 '21 at 18:39
  • They all do different things — But as a general note you should almost never use *any* of these variants. 99% of the time they’re the wrong solution (with the possible exception of `do.call`). In all three questions you link to, as well as the code you’ve posted, they’re flat out the wrong solution. The better solution in all these cases is to use subsetting with lists or vectors, or to combine individual tables into a larger table. – Konrad Rudolph Jan 04 '21 at 19:06
  • @KonradRudolph Thanks! Can you link me to a response that covers why "subsetting with lists" or combining individual tables into a larger table is better? – Josh Jan 05 '21 at 08:13
  • @Josh It follows from how software is formally analysed. This analysis gets a lot more complex when you don’t know what variables exist at a given point. In other words, variables should be *static*. This matters because the same techniques are used intuitively when readers try to understand code. In other words, assuming that variables are static makes code (vastly!) more readable. But much more mundanely, using lists is just a lot easier than all this mucking around with metaprogramming. It’s a lot less code to write and read. `foo[[i]] = x` *vs* `assign(paste0('foo', i), x)`. – Konrad Rudolph Jan 05 '21 at 09:57

1 Answer

1) assign dynamically assigns a value to a variable whose name is passed as a character string, and get dynamically retrieves the value of a variable whose name is passed as a character string.

assign("x", 3)
x
## [1] 3

get("x")
## [1] 3
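
To relate this to the other forms in the question, here is a minimal sketch comparing get with eval(parse(text = ...)) and eval(as.name(...)). The environment env and the variable names are only for illustration; an explicit environment is used so the lookup does not depend on where the code is run.

```r
# Store a value under the name "x" in a dedicated environment
env <- new.env()
assign("x", 3, envir = env)

by_get   <- get("x", envir = env)                 # look the name up directly
by_parse <- eval(parse(text = "x"), envir = env)  # parse the string into code, then evaluate it
by_name  <- eval(as.name("x"), envir = env)       # turn the string into a symbol, then evaluate it

# All three retrieve the same value
identical(by_get, by_parse) && identical(by_get, by_name)
## [1] TRUE
```

For a plain variable name the three agree, but eval(parse(text = ...)) will evaluate any code contained in the string, whereas get only ever looks up a name.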

Thus, if we have a character vector of variable names, we can iterate over them and produce derived variables containing the dimensions like this:

nms <- c("mtcars", "iris")
for(nm in nms) assign(paste0(nm, ".dims"), dim(get(nm)))
mtcars.dims
## [1] 32 11
iris.dims
## [1] 150   5

although normally we would create a list (or possibly a matrix in this case) rather than leaving them loose.

Map(function(nm) dim(get(nm)), nms)
## $mtcars
## [1] 32 11
## 
## $iris
## [1] 150   5

or the following which gives the same result

Map(dim, mget(nms, inherits = TRUE))
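
Applied to the batch process in the question, the list approach might look like the sketch below. It assumes base R data frames as small hypothetical stand-ins for dt1 and dt2 (the names tables and tables2 are also made up), so it is an illustration of the idea rather than a drop-in replacement.

```r
# Hypothetical stand-ins for dt1 and dt2, kept together in one named list
tables <- list(
  dt1 = data.frame(urltest = c("a", "b"), istest = c("w", "x"),
                   urlteststatus = c("a", "b")),
  dt2 = data.frame(urltest = c("ca", "cb"), istest = c("b", "b"),
                   urlteststatus = c("a", "b"))
)

# Add the concatenated column to every table; the result is again a named list
tables2 <- lapply(tables, function(d) {
  d$r_teststatus <- paste(d$urltest, d$istest, d$urlteststatus)
  d
})

tables2$dt1$r_teststatus
## [1] "a w a" "b x b"
```

No assign or get is needed: each table is reached by subsetting, for example tables2[["dt2"]], and the set of variables in the workspace stays fixed.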

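2) deparse and substitute operate on expressions rather than character strings. deparse converts an expression to a character string. substitute substitutes values into an expression; when called inside a function on one of its arguments, it returns the unevaluated expression that was passed for that argument.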

deparse(quote(1+2))
## [1] "1 + 2"

substitute(y * y, list(y = 3))
## 3 * 3
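
The combination deparse(substitute(x)) from the question is typically used inside a function to recover the code that the caller wrote for an argument. A minimal sketch, where show_name is a hypothetical helper:

```r
# show_name is a made-up helper: it returns the caller's argument expression as text
show_name <- function(x) deparse(substitute(x))

show_name(mtcars)
## [1] "mtcars"

# The argument is never evaluated, so a and b do not need to exist
show_name(a + b)
## [1] "a + b"
```

This goes from a variable to a string, which is the opposite direction from get, and it does not retrieve any value.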

3) do.call executes a function whose name is passed as a character string (or passed as a function object) and whose arguments are passed in a list.

do.call("sqrt", list(4))
## [1] 2
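
do.call is most useful when the arguments are already in a list, for example when combining the elements of a list of data frames. A short sketch with made-up inputs (args, parts and combined are illustrative names):

```r
# Named list elements are matched to named arguments of the function
args <- list("a", "b", sep = "-")
do.call(paste, args)
## [1] "a-b"

# Combine a list of data frames into one, equivalent to rbind(parts[[1]], parts[[2]])
parts <- list(data.frame(id = 1:2), data.frame(id = 3:4))
combined <- do.call(rbind, parts)
nrow(combined)
## [1] 4
```
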
G. Grothendieck