I'm working with SQL and user input. So I'm using the glue
library to handle with parameterized queries.
However, to keep things clean, I wrapped that all in a function:
safeQuery <- function(con, sql, ...) {
sql = glue_sql(sql, ..., .con=con)
query <- dbSendQuery(con, sql)
out <- dbFetch(query)
dbClearResult(query)
return(out)
}
So I just call that function with the connection, the SQL code and a list of parameters for that SQL code which glue_sql
binds appropriately.
This works just fine.
Now, I have a specific SQL call which I use in one way or another quite often, but with varying parameters.
So I decided to create a function for that:
get_data <- function(con, params) {
safeQuery(con,
"SELECT *
FROM foo
WHERE bar IN ({vars*})",
vars=params)
}
p = c(1, 2)
get_data(con, p)
So, the user data (in this case c(1, 2)
) would be passed to get_data
, which would pass it along with the SQL call to safeQuery
, where glue_sql
will take care of the bindings.
However, if I actually try to run get_data
, I get an error
object 'params' not found
Googling and SO'ing has made it clear this has to do with R's lazy evaluation.
And indeed, changing get_data
to
get_data <- function(con, params) {
do.call("safeQuery",
list(con,
"SELECT *
FROM foo
WHERE bar IN ({vars*})",
vars=params)
}
(as recommended by this answer) works just fine, since do.call
evaluates the parameters in the list before sending them to safeQuery
.
What I don't understand is why this is necessary in the first place. After all, the value of params
isn't modified at any step along its way to glue_sql
, so it should still be available.
The linked answer discusses the use of substitute
(I also read this R-bloggers post on the subject) to replace the argument's name with the caller's name for it (or with its actual value if the argument value was given directly), but that didn't work in my case. Modifying get_data
to use substitute
:
get_data <- function(con, params) {
do.call("safeQuery",
list(con,
"SELECT *
FROM foo
WHERE bar IN ({vars*})",
vars=substitute(params))
}
resulted in the following SQL from glue_sql
:
SELECT *
FROM foo
WHERE bar IN (params)
instead of the actual values of params
. I couldn't try the same within safeQuery
since the parameters are hidden in ...
and substitute(...)
doesn't work. I tried.
I've also tried calling force(params)
at the start of get_data
, but that gave the same object not found
error.
get_data <- function(con, params) {
force(params)
do.call("safeQuery",
list(con,
"SELECT *
FROM foo
WHERE bar IN ({vars*})",
vars=params)
}
So, why does params
get "lost" with the standard calls? Why does do.call
work, but not force(params)
? Is it possible to make this work using standard evaluation?
And I'm not going to lie: this experience has confused me about how to write my functions and handle their parameters (I'm considering only using do.call
from now on). If tips can be given without over-extending the scope of this question, I'd be much obliged.