10

In R's DBI package, I'm not finding a facility for using bound variables. I did find a document (the original vignette from 2002) that says about bound variables, "Perhaps the DBI could at some point in the future implement this feature", but it looks like so far that's left undone.

What do people in R use for a substitute? Just concatenate strings right into the SQL? That's got some obvious problems for safety & performance.

EDIT:

Here's an example of how placeholders could work:

query <- "SELECT numlegs FROM animals WHERE color=?"
result <- dbGetQuery(caseinfo, query, bind="green")

That's not a very well-thought-out interface, but the idea is that you can use a value for bind and the driver handles the details of escaping (if the underlying API doesn't handle bound variables natively) without the caller having to reimplement it [badly].

Ken Williams
  • 22,756
  • 10
  • 85
  • 147
  • can you provide some example code that would operate like you would like it to operate? what behaviour would you expect bound variables to have? – Alex Brown Feb 02 '10 at 17:18
  • Do you mean something like this? http://stackoverflow.com/questions/2182337/how-to-use-a-variable-name-in-a-mysql-statement – gd047 Feb 02 '10 at 17:25
  • 1
    I smell a new DBI with better features coming along. We will all be happy beta-testers, Ken.... – Dirk Eddelbuettel Feb 02 '10 at 19:01
  • That would be great if there's a new DBI in the works. Is there a code repository for DBI somewhere that I could hack on, in the vanishingly-small chance that I get some round tuits? – Ken Williams Feb 02 '10 at 22:14
  • 1
    No I was half-joking. DBI goes back to David James who no longer works where he once did. I mentored a Google Summer of Code student to create Postgresql bindings using DBI as a given; that project is now on CRAN as RPostgreSQL (and could do with a new maintainer for further extension as the student has left). A new DBI would be a worthy rewrite, or 'DBI 2.0', but also quite an undertaking. Are you up for it? You seem to be the one with the itch to scratch so ... – Dirk Eddelbuettel Feb 02 '10 at 22:37
  • Oh, you thought the smell of a new DBI was coming from me! I don't smell that. =) But I can take a look and see at least how difficult the technical aspects of this would be. – Ken Williams Feb 03 '10 at 16:33
  • Possible duplicate of [Proper way to pass parameters to query in R DBI](https://stackoverflow.com/questions/37131569/proper-way-to-pass-parameters-to-query-in-r-dbi) – Scarabee Jun 08 '17 at 22:29

3 Answers3

18

For anyone coming to this question like I just did after googling for rsqlite and dbgetpreparedquery, it seems that in the latest version of rsqlite you can run a SELECT query with bind variables. I just ran the following:

query <- "SELECT probe_type,next_base,color_channel FROM probes WHERE probeid=?"
probe.types.df <- dbGetPreparedQuery(con,que,bind.data=data.frame(probeids=ids))

This was relatively fast (selecting 2,000 rows out of a 450,000 row table) and is incredibly useful.

FYI.

user1076
  • 721
  • 7
  • 5
  • I'm unclear what the difference is between dbgetpreparedquery and dbsendpreparedquery – Carbon Jan 18 '15 at 05:28
  • 1
    As I understand it, the `GetQuery` functions return the results as a dataframe, whereas the `SendQuery` functions return a cursor from which you can request results in batches using the `fetch` method. Check out the documentation for RSQLITE::query http://cran.r-project.org/web/packages/RSQLite/RSQLite.pdf – David Marx May 12 '15 at 21:23
3

Below is a summary of what's currently supported in RSQLite for bound parameters. You are right that there is not currently support for SELECT, but there is no good reason for this and I would like to add support for it.

If you feel like hacking, you can get a read-only checkout of all of the DBI related packages here:

use --user=readonly --password=readonly

https://hedgehog.fhcrc.org/compbio/r-dbi/trunk
https://hedgehog.fhcrc.org/compbio/r-dbi/trunk/DBI
https://hedgehog.fhcrc.org/compbio/r-dbi/trunk/SQLite/RSQLite

I like to receive patches, especially if they include tests and documentation. Unified diff, please. I actually do all my development using git and so best case is to create a git clone of say RSQLite and then send me diffs as git format-patch -n git-svn..

Anyhow, here are some examples:

library("RSQLite")

make_data <- function(n)
{
    alpha <- c(letters, as.character(0:9))
    make_key <- function(n)
    {
        paste(sample(alpha, n, replace = TRUE), collapse = "")
    }
    keys <- sapply(sample(1:5, replace=TRUE), function(x) make_key(x))
    counts <- sample(seq_len(1e4), n, replace = TRUE)
    data.frame(key = keys, count = counts, stringsAsFactors = FALSE)
}

key_counts <- make_data(100)


db <- dbConnect(SQLite(), dbname = ":memory:")

sql <- "
create table keys (key text, count integer)
"

dbGetQuery(db, sql)

bulk_insert <- function(sql, key_counts)
{
    dbBeginTransaction(db)
    dbGetPreparedQuery(db, sql, bind.data = key_counts)
    dbCommit(db)
    dbGetQuery(db, "select count(*) from keys")[[1]]
}

##  for all styles, you can have up to 999 parameters

## anonymous
sql <- "insert into keys values (?, ?)"
bulk_insert(sql, key_counts)


## named w/ :, $, @
## names are matched against column names of bind.data

sql <- "insert into keys values (:key, :count)"
bulk_insert(sql, key_counts[ , 2:1])

sql <- "insert into keys values ($key, $count)"
bulk_insert(sql, key_counts)

sql <- "insert into keys values (@key, @count)"
bulk_insert(sql, key_counts)

## indexed (NOT CURRENTLY SUPPORTED)
## sql <- "insert into keys values (?1, ?2)"
## bulk_insert(sql)
seth
  • 259
  • 1
  • 3
1

Hey hey - I just discovered that RSQLite, which is what I'm using in this case, does indeed have bound-variable support:

http://cran.r-project.org/web/packages/RSQLite/NEWS

See the entry about dbSendPreparedQuery() and dbGetPreparedQuery().

So in theory, that turns this nastiness:

df <- data.frame()
for (x in data$guid) {
  query <- paste("SELECT uuid, cites, score FROM mytab WHERE uuid='",
                 x, "'", sep="")
  df <- rbind(df, dbGetQuery(con, query))
}

into this:

df <- dbGetPreparedQuery(
     con, "SELECT uuid, cites, score FROM mytab WHERE uuid=:guid", data)

Unfortunately, when I actually try it, it seems that it's only for INSERT statements and the like, not for SELECT statements, because I get an error: RS-DBI driver: (cannot have bound parameters on a SELECT statement).

Providing that capability would be fantastic.

The next step would be to hoist this up into DBI itself so that all DBs can take advantage of it, and provide a default implementation that just pastes it into the string like we're all doing ourselves now.

Ken Williams
  • 22,756
  • 10
  • 85
  • 147