4

I am unable to get the R language DBI::sqlAppendTable function to work with anything other than numbers. Below is a bit of code illustrating the problem. I suspect the problem is that sqlAppendTable does not quote data. Any fix or workaround would be greatly appreciated.

num = data.frame(matrix(1:26, ncol=2))
let = data.frame(matrix(letters, ncol=2))

test.sqlAppendTable = function(dfr) {
    #dfr: A data frame.
    conx <- dbConnect(RSQLite::SQLite(), ":memory:")
    on.exit(dbDisconnect(conx))
    dbWriteTable(conx, "temp", dfr[1:5, ])
    temp = dbReadTable(conx, 'temp')
    print(temp)
    sat = sqlAppendTable(conx, 'temp', dfr[6:10, ])
    print(sat)
    rs = dbExecute(conx, sat)
    cat('Result set (rs): ')
    print(rs)
    temp = dbReadTable(conx, 'temp')
    print(temp)
}

test.sqlAppendTable(num) #Runs fine.
test.sqlAppendTable(let) #Generates error:
#Error in rsqlite_send_query(conn@ptr, statement) : no such column: j
Argent
  • 885
  • 2
  • 9
  • 18
  • 1
    Why not use RSQLite's `dbWriteTable`, specifying *append* argument? – Parfait Apr 01 '17 at 20:32
  • dbWriteTable with append sounds promising. – Argent Apr 02 '17 at 21:54
  • However, in the documentation of dbWriteTable I see no mention of append: http://127.0.0.1:31912/library/DBI/html/dbReadTable.html – Argent Apr 02 '17 at 22:10
  • Check official [CRAN docs](https://cran.r-project.org/web/packages/RSQLite/RSQLite.pdf) where *append* is an argument in `dbWriteTable` method. – Parfait Apr 03 '17 at 02:26
  • @Parfait: You are right that dbWriteTable has an append option. But see my revised answer below: Writing a temporary table to the database, and inserting it into the target table, runs faster. – Argent Apr 06 '17 at 20:28

3 Answers3

3

I've run into this issue one too many times to not take a stab at writing my own work-around. Personally, I ran into this same issue with Microsoft SQL Server, but I figured this same solution would work for SQLite. I'm working with:

  • Database: Microsoft SQL Server hosted in Azure
  • R: 3.5.0
  • DBI: 1.0.0
  • odbc: 1.1.6
  • OS: Ubuntu 18.04

Approach:

I wanted to avoid looping through rows for the sake of efficiency. I found that mapply and paste0 could be combined in a more column-oriented fashion.

I'll admit it's a bit "hacky," but it's been working well for myself. Use at your own risk; I'm only using this for a small side project, not an enterprise solution. Efficiency shouldn't be that big of an issue anyway, since there's a 1000 row limit on inserts anyway.

Replacement for "sqlAppendTable":

db_sql_append_table <- function(p_df, p_tbl) {
    # p_df: data.frame that contains the data to append/insert into the table
    # the names must be the same as those in the database
    # p_tbl: the name of the database table to insert/append into
    
    num_rows <- nrow(p_df)
    num_cols <- ncol(p_df)
    requires_quotes <- sapply(p_df, class) %in% c("character", "factor", "Date")
    commas <- rep(", ", num_rows)
    quotes <- rep("'", num_rows)
    
    str_columns <- ' ('
    column_names <- names(p_df)
    
    for(i in 1:num_cols) {
        if(i < num_cols) {
            str_columns <- paste0(str_columns, column_names[i], ", ")
        } else {
            str_columns <- paste0(str_columns, column_names[i], ") ")
        }
    }
    
    str_query <- paste0("INSERT INTO ", p_tbl, str_columns, "\nVALUES\n")   
    str_values <- rep("(", num_rows)
    
    for(i in 1:num_cols) {
        
        # not the last column; follow up with a comma
        if(i < num_cols) {
            if(requires_quotes[i]) {
                str_values <- mapply(paste0, str_values, quotes, p_df[[column_names[i]]], quotes, commas)        
            } else {
                str_values <- mapply(paste0, str_values, p_df[[column_names[i]]], commas)
            }
            
        # this is the last column; follow up with closing parenthesis
        } else {
            if(requires_quotes[i]) {
                str_values <- mapply(paste0, str_values, quotes, p_df[[column_names[i]]], quotes, ")")
            } else {
                str_values <- mapply(paste0, str_values, p_df[[column_names[i]]], ")")
            }
        }
    }
    
    # build out the query; collapse values with comma & newline; end with semicolon;
    str_values <- paste0(str_values, collapse=",\n")
    str_query <- paste0(str_query, str_values)
    str_query <- paste0(str_query, ";")
    return(str_query)
}

Calling the function:

I wanted to keep this as similar to the original sqlAppendTable function as possible. This function only constructs the query.

You still have to wrap this function in a call to dbExecute() to actually insert/append rows to the database.

dbExecute(conn=conn, statement = db_sql_append_table(my_dataframe, "table_name"))

EDIT

  • Added "Date" as one of the types that needs to be quoted by this function as well. Thanks for that comment!
Community
  • 1
  • 1
TaylorV
  • 846
  • 9
  • 13
  • 1
    Great answer / workaround - cannot believe they still have not managed to get that INSERT working decently?! I noticed, and as future reference for anyone else having these issues, at least with MS SQL databases it also helps to add "Date" class to code line 3 ie `requires_quotes <- sapply(p_df, class) %in% c("character", "factor", "Date")` – GWD Jun 04 '20 at 00:04
0

This looks like a deficiency in the RSQLite package; values should be quoted indeed. Note that the quotes are added in the following example:

DBI::sqlAppendTable(DBI::ANSI(), table = "test", data.frame(a = 1, b = "2"))
#> <SQL> INSERT INTO "test"
#>   ("a", "b")
#> VALUES
#>   (1, '2')

See also the corresponding GitHub issue.

krlmlr
  • 25,056
  • 14
  • 120
  • 217
0

I have been able to cobble together a workaround:

1) Write the data frame that you want to append as a temporary table (temp) in the SQLite database file.

2) Append it to the target table (target) using the SQLite statement:

insert into target select * from temp;

3) Drop temp.

This runs quite fast, presumably thanks to SQLite being well optimized.

ADDENDUM:

You can indeed append a data frame to a database table using dbWriteTable, with the option append=TRUE. I have tested this against my workaround described above and, surprisingly, the workaround runs almost 40% faster than dbWriteTable.

Argent
  • 885
  • 2
  • 9
  • 18