0

Using package:DBI, I need to:

  1. run a parametrized query with different parameters (i.e. a vector of parameters);
  2. get the results sets concatenated (i.e. rbinded as per R terminology or unioned as per SQL terminology);
  3. and get the resulting table in the database for further manipulation.

dbBind()/dbGetquery() fullfils requirements 1 and 2, but I then need to write the resulting data frame to the database using dbWriteTable(), which is ineficient:

library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "iris", iris)

res <- dbGetQuery(con,
                  "select * from iris where Species = ?",
                  params = list(c("setosa", "versicolor")))

dbWriteTable(con, "mytable", res)

Conversely, dbExecute() fulfils requirement 3, but I don't think it has the "rbind feature". Of course, this throw an error because the table would get overwritten:

dbExecute(con,
          "create table mytable as select * from iris where Species = ?",
          params = list(c("setosa", "versicolor")))

What is the most efficient/recommended way of doing so?

Notes:

  • I am not the DBA and can only access the database through R.
  • My example is too trivial and could be achieved in a single query. My use case really requires a parametrized query to be run multiple times with different parameters.
  • I have to use Oracle, but I am interested in a solution even if it don't works with Oracle.
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Thomas
  • 457
  • 2
  • 12
  • 2
    Instead of query/download/combine/upload, why not use a parameterized query that inserts directly into a (temp?) table such as using `SELECT INTO ...`. – r2evans Feb 05 '23 at 19:26
  • I didn't have the idea! Could you please provide more details in an answer? (I'm quite new to SQL.) Thank you very much! – Thomas Feb 05 '23 at 19:56
  • 1
    Searching `oracle select into` returns several good links, I won't regurgitate them here. You can probably do something like your `create table mytable as ...` above with your _first_ query, and then from there do `select * into mytable from iris where Species=?` for each of your individual queries (no need to try to "union" them, frankly, though you can, perhaps with CTEs or such. – r2evans Feb 05 '23 at 21:30
  • 1
    FYI, you say *"parameterized query to be run multiple times"*, it might be possible to convert that process into something more fluid. If your parameters are (say) a `obj <- data.frame(a=1:2, b=3:4, d=7:8)` and you want to submit queries based on each _row_ of this frame, then perhaps: (1) insert this into a temporary table; (2) `select ... from realtable rt inner join mytemp mt on rt.a=mt.a and rt.e between mt.b and mt.d`; (3) clean up (remove the temp table). This can be done either as a "download" or as a "select into" methodology. – r2evans Feb 05 '23 at 21:34
  • Thank you for your comments. It showed me a method that I didn't know and allowed me to solve my issue. I found I have to use [`INSERT INTO` instead of `SELECT INTO`](https://stackoverflow.com/q/6947983/11148823) if the table that collects the results already exists. I posted an answer with the method I used. Comments are welcome! – Thomas Feb 07 '23 at 10:10
  • And many thanks for your second comment showing a more fluid approach. This cannot apply to my concrete use case (my example is too simple), but I will used it for sure! – Thomas Feb 07 '23 at 10:13

2 Answers2

1

1) Create the table with the first parameter and then insert each of the others into it.

library(RSQLite)

con <- dbConnect(SQLite())
dbWriteTable(con, "iris", iris)

parms <- c("setosa", "versicolor")

dbExecute(con, "create table mytable as
  select * from iris where Species = ?",
  params = parms[1])
for (p in parms[-1]) {
  dbExecute(con, "insert into mytable
    select * from iris where Species = ?",
    params = p)
}

# check
res <- dbGetQuery(con, "select * from mytable")
str(res)

2) Alternately generate the text of an SQL statement to do it all. sqldf pulls in RSQLite and gsubfn which supplies fn$ that enables the text substitution.

library(sqldf)

con <- dbConnect(SQLite())
dbWriteTable(con, "iris", iris)

parms <- c("setosa", "versicolor")
parmString <- toString(sprintf("'%s'", parms))
fn$dbExecute(con, "create table mytable as
  select * from iris where Species in ($parmString)")

# check
res <- dbGetQuery(con, "select * from mytable")
str(res)

3) A variation of (2) is to insert the appropriate number of question marks.

library(sqldf)

con <- dbConnect(SQLite())
dbWriteTable(con, "iris", iris)

params <- list("setosa", "versicolor")
quesString <- toString(rep("?", length(params)))

fn$dbExecute(con, "create table mytable as
  select * from iris where Species in ($quesString)", params = params)

# check
res <- dbGetQuery(con, "select * from mytable")
str(res)
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • Thank you for your answer. Your answer helped me to build a similar approach. I posted an answer with the method I used. Comments are welcome! – Thomas Feb 07 '23 at 10:03
0

Based on @r2evans comment and @G.Grothendieck answer, instead of query/download/combine/upload, I used a parameterized query that inserts directly into a table.

First, I created the table with the appropriate columns to collect the results:

library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

create_table <-
"
CREATE TABLE
  warpbreaks2 (
    breaks real,
    wool text,
    tension text
);
"

dbExecute(con, create_table)

Then I executed an INSERT INTO step:

dbWriteTable(con, "warpbreaks", warpbreaks)

insert_into <-
"
INSERT INTO
  warpbreaks2
SELECT
  warpbreaks.breaks,
  warpbreaks.wool,
  warpbreaks.tension
FROM
  warpbreaks
WHERE
  tension = ?;
"

dbExecute(con, insert_into, params = list(c("L", "M")))

This is a dummy example for illustration purpose. It could be achieve more directly with e.g.:

direct_query <-
"
CREATE TABLE
  warpbreaks3 AS
SELECT
  *
FROM
  warpbreaks
WHERE
  tension IN ('L', 'M');
"

dbExecute(con, direct_query )
Thomas
  • 457
  • 2
  • 12