11

I'm trying to insert data into a pre-existing PostgreSQL table using RPostgreSQL and I can't figure out the syntax for SQL parameters (prepared statements).

E.g. suppose I want to do the following

insert into mytable (a,b,c) values ($1,$2,$3)

How do I specify the parameters? dbSendQuery doesn't seem to understand if you just put the parameters in the ....

I've found dbWriteTable can be used to dump an entire table, but won't let you specify the columns (so no good for defaults etc.). And anyway, I'll need to know this for other queries once I get the data in there (so I suppose this isn't really insert specific)!

Sure I'm just missing something obvious...

Corvus
  • 7,548
  • 9
  • 42
  • 68
  • 1
    [Does this answer help][1]? The question appears to be similar to the one asked there. [1]: http://stackoverflow.com/questions/2186015/bind-variables-in-r-dbi – Joe Love Nov 25 '13 at 19:15
  • 1
    @JoeLove I think (but am not 100% sure) that that is specific to RSQLite. – joran Nov 25 '13 at 19:16
  • @joran, yeah, that came across wrong (deleted it). I sort of meant the opposite: that R is a stats language, it would be wrong to get upset that the db functionality is weak compared to a general purpose language. Just as you wouldnt expect glm in c#. As in: it's not R's fault I'm using the wrong tool for the job. – Corvus Nov 25 '13 at 20:46
  • @Corone Yeah, I actually agree with the general assessment. Much of R's db functionality is designed from the idea that the data is always going to go one way, from the db to R for analysis, I think. – joran Nov 25 '13 at 20:47

3 Answers3

8

I was looking for the same thing, for the same reasons, which is security.

Apparently dplyr package has the capacity that you are interested in. It's barely documented, but it's there. Scroll down to "Postgresql" in this vignette: http://cran.r-project.org/web/packages/dplyr/vignettes/databases.html

To summarize, dplyr offers functions sql() and escape(), which can be combined to produce a parametrized query. SQL() function from DBI package seems to work in exactly same way.

> sql(paste0('SELECT * FROM blaah WHERE id = ', escape('random "\'stuff')))
<SQL> SELECT * FROM blaah WHERE id = 'random "''stuff'

It returns an object of classes "sql" and "character", so you can either pass it on to tbl() or possibly dbSendQuery() as well.

The escape() function correctly handles vectors as well, which I find most useful:

> sql(paste0('SELECT * FROM blaah WHERE id in ', escape(1:5)))
<SQL> SELECT * FROM blaah WHERE id in (1, 2, 3, 4, 5)

Same naturally works with variables as well:

> tmp <- c("asd", 2, date())
> sql(paste0('SELECT * FROM blaah WHERE id in ', escape(tmp)))
<SQL> SELECT * FROM blaah WHERE id in ('asd', '2', 'Tue Nov 18 15:19:08 2014')

I feel much safer now putting together queries.

LauriK
  • 1,899
  • 15
  • 20
  • 2
    You can also use the new https://github.com/rstats-db/RPostgres which has full support for parameterised queries – hadley Feb 24 '15 at 16:22
  • Thanks @hadley. Got any examples on the parametrized part? I wrote a [blog post](http://welcome-to-data-science.blogspot.com/2015/02/how-to-manage-large-amount-of-data-with.html) a few days ago on the performance of large inserts and I'd like to do a comparison between the packages. – LauriK Feb 25 '15 at 09:19
  • 1
    Not yet, but I'm working on a blog post for the Rstudio blog next week. Note that parameterised inserts are slow for Postgres, so dbWriteTable in RPostgres uses a different strategy (LOAD INFILE ) – hadley Feb 25 '15 at 15:25
  • 1
    I guess, the question was about inserting/appending/updating data to specific columns. – RanonKahn Nov 28 '17 at 20:13
6

As of the latest RPostgreSQL it should work:

db_connection <- dbConnect(dbDriver("PostgreSQL"), dbname = database_name,
                   host = "localhost", port = database_port, password=database_user_password,
                   user = database_user)
qry = "insert into mytable (a,b,c) values ($1,$2,$3)"
dbSendQuery(db_connection, qry, c(1, "some string", "some string with | ' "))
Niels Kristian
  • 8,661
  • 11
  • 59
  • 117
  • 1
    I cannot get a similar thing to work, at least with `dbGetQuery`. Could you please provide a more detailed source? – Pranasas Apr 10 '17 at 09:03
4

Here's a version using the DBI and RPostgres packages, and inserting multiple rows at once, since all these years later it's still very difficult to figure out from the documentation.

x <- data.frame(
  a = c(1:10),
  b = letters[1:10],
  c = letters[11:20]
)

# insert your own connection info
con <- DBI::dbConnect(
  RPostgres::Postgres(),
  dbname = '', 
  host = '', 
  port = 5432, 
  user = '',
  password = ''
)

RPostgres::dbSendQuery(
  con, 
  "INSERT INTO mytable (a,b,c) VALUES ($1,$2,$3);",
  list(
    x$a,
    x$b,
    x$c
  )
)

The help for dbBind() in the DBI package is the only place that explains how to format parameters:

The placeholder format is currently not specified by DBI; in the future, a uniform placeholder syntax may be supported. Consult the backend documentation for the supported formats.... Known examples are:

  • ? (positional matching in order of appearance) in RMySQL and RSQLite
  • $1 (positional matching by index) in RPostgres and RSQLite
  • :name and $name (named matching) in RSQLite

? is also the placeholder for R package RJDBC.

tshynik
  • 127
  • 1
  • 8