11

The dbWriteTable function in RPostgreSQL seems to ignore column names and tries to push data from R to PostgreSQL as-is. This is problematic when appending to existing tables, particularly if there are columns un-specified in the R object that should be given default values.

RMySQL handles this case very gracefully by adding the column names to LOAD DATA LOCAL INFILE. How do I force RPostgreSQL to assign default values to un-specified columns in dbWriteTable when append=TRUE?

Here is an example:

CREATE TABLE test (
    column_a varchar(255) not null default 'hello',
    column_b integer not null 
    );
insert into test values (DEFAULT, 1);

Which yields the following table:

select * from test;
 column_a | column_b 
----------+----------
 hello    |        1
(1 row)

I want to insert some new data to this table from R:

require('RPostgreSQL')
driver <- PostgreSQL()
con <- dbConnect(driver, host='localhost', dbname='development')
set.seed(42)
x <- data.frame(column_b=sample(1:100, 10))
dbWriteTable(con, name='test', value=x, append=TRUE, row.names=FALSE)
dbDisconnect(con)

But I get the following error:

Error in postgresqlgetResult(new.con) : 
  RS-DBI driver: (could not Retrieve the result : ERROR:  missing data for 
column "column_b"
CONTEXT:  COPY test, line 1: "92"
)

This is because I have not specified the column_a field, so dbWriteTable is trying to write the data for column_b into column_a. I would like to force dbWriteTable to use the defaults for column_a, and properly write column_b to column_b.

I should only get a failure if:

  1. I fail to specify a column with no default value
  2. I try to insert a column that doesn't exist in the table
  3. I insert the wrong datatype into an existing column
Zach
  • 29,791
  • 35
  • 142
  • 201
  • It would help if you wrote up a little example of creating a table, and then unsuccessfully trying to append to it. – nograpes Mar 04 '14 at 00:19
  • What happen when you use `x <- data.frame(column_a='hello', column_b=sample(1:100, 10))` ? I am not sure if defaults are supported... The other approach would be to send actual SQL 'insert' statements, rather than to rely on `dbWriteTable`. – Dirk Eddelbuettel Mar 04 '14 at 17:56
  • @Dirk Eddelbuettel It works fine in that case, but in my situation the R program doesn't always know the correct default value to use. Rather than guessing, I'd like to let the postgres server handle the default values. The RMySQL 'dbWriteTable' function handles default values in this situation, so I was hoping the RPostgreSQL 'dbWriteTable' function would as well. For now I'll send the actual SQL 'insert' statements, thank you for the suggestion. – Zach Mar 04 '14 at 18:02
  • 2
    This is fixed in RPostgres (http://github.com/rstats-db/RPostgres) – hadley Feb 18 '15 at 03:58
  • 1
    @hadley Thank you for writing this package! Please post your comment as an answer, as it's the best solution to my problem. – Zach Feb 18 '15 at 14:42

1 Answers1

7

I had exactly the same problem, this fixed it.

Check out the dbWriteTable2 function from package caroline.

The code then allows you to write a data frame without an id column into the database using add_id = TRUE, e.g.

dbWriteTable2(con_psql,"domains",data_domains,append=TRUE,overwrite=FALSE,row.names=FALSE,add.id=TRUE)
Alex
  • 15,186
  • 15
  • 73
  • 127
  • Does it add the column names to the insert statement so they don't need to be in the same order? – Zach May 20 '14 at 12:38
  • When I used it I had matching column names in my data frame in the same order. The only column I left out in my data frame was `id`. – Alex May 21 '14 at 00:47
  • New test: writing columns in random order + extra columns. It all worked, it even dealt with the extra column well: ` Found 'url' not in fields of 'rankings' table. Omiting.` – Alex May 21 '14 at 01:55
  • In fact, if there is a missing column in the data frame this function seems to just write `NULL` into the database. – Alex May 21 '14 at 02:06
  • Awesome. I'll have to check it out. Thanks! – Zach May 21 '14 at 13:14
  • Caroline is a very outdated package (last update from February 19, 2015 ) and RPostgres it's not working for me. Any other suggestion.? – useRj Apr 27 '17 at 10:40
  • if it does what you want, why are you worried? I still use it. – Alex Apr 27 '17 at 11:39