8

Is it possible to perform a bulk insert into an MS-SQL Server (2000, 2005, 2008) using the RODBC package?

I know that I can do this using freebcp, but I'm curious if the RODBC package implements this portion of the Microsoft SQL API and if not, how difficult it would be to implement it.

Sam
ephpostfacto

6 Answers

6

Check out the newer odbc and DBI packages. DBI::dbWriteTable writes around 20,000 records per second, much faster than the row-by-row inserts from RODBC::sqlSave().
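A minimal sketch of that approach; the driver name, server, database, and table name below are placeholders you'd adjust for your environment:

```r
library(DBI)
library(odbc)

# Connection details are assumptions -- substitute your own driver/server/DB.
con <- dbConnect(odbc::odbc(),
                 Driver   = "ODBC Driver 17 for SQL Server",
                 Server   = "myserver",
                 Database = "mydb",
                 Trusted_Connection = "Yes")

# dbWriteTable sends the whole data.frame in bound batches rather than
# issuing one INSERT per row, which is where the speedup comes from.
dbWriteTable(con, "my_table", mtcars, overwrite = TRUE)

dbDisconnect(con)
```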

quickreaction
2

You're probably looking for ?sqlSave, which uses a parametrized INSERT INTO query (taking place in one operation) when you set fast = TRUE.
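For reference, a hedged sketch of that call; the DSN and table name are assumptions:

```r
library(RODBC)

ch <- odbcConnect("myDSN")   # assumed DSN pointing at the SQL Server instance

# fast = TRUE sends the data via a parametrized INSERT INTO in one operation
# instead of writing a row at a time.
sqlSave(ch, mtcars, tablename = "my_table",
        fast = TRUE, rownames = FALSE)

odbcClose(ch)
```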

DMK
Tyler
  • Nah, sqlSave will do multiple INSERTs. I want a BULK INSERT, which is a single transaction. – ephpostfacto Nov 06 '09 at 21:42
  • Does fast=true not do it as a single transaction? – Tyler Nov 09 '09 at 23:40
  • From the RODBC docs: "logical. If false, write data a row at a time. If true, use a parametrized INSERT INTO or UPDATE query to write all the data in one operation." However, it doesn't seem to make any difference (when writing to Netezza in my case). – jpmorris Oct 26 '16 at 20:27
  • I am afraid that in my experience setting fast = TRUE or fast = FALSE doesn't change a thing. Records are inserted one by one no matter what. At least in my case with SQL Server 2012. – Matteo Castagna Mar 22 '19 at 15:23
  • For me, fast = TRUE makes the sqlSave function ~3 times faster. Also on SQL Server 2012. – LeslieKish Apr 16 '19 at 21:30
2

Now you can use dbBulkCopy from the new rsqlserver package:

A typical scenario:

  1. You create a matrix.
  2. You save it as a CSV file.
  3. You call dbBulkCopy to read the file and insert it, internally using the bcp tool of MS SQL Server.

This assumes that your table already exists in the database:

nr <- 1000; nc <- 5                       # example dimensions
dat <- matrix(rnorm(nr * nc), nr, nc)
id.file <- "temp_file.csv"
write.csv(dat, file = id.file, row.names = FALSE)
dbBulkCopy(conn, 'NEW_BP_TABLE', value = id.file)
agstudy
  • any reason why rsqlserver is not on cran? – jangorecki Feb 06 '14 at 16:56
  • @MusX because it is under development (especially the documentation and tests) and it uses the `rClr` package, which is not on CRAN either. But you are encouraged to use it from GitHub and will be happy with any feedback. – agstudy Feb 06 '14 at 17:00
2

Using RODBC, the fastest insert we've been able to create (a 260 million row insert) looks like the following (in R pseudocode):

ourDataFrame <- sqlQuery(OurConnection, "SELECT myDataThing1, myDataThing2
                                         FROM myData")
ourDF <- doStuff(ourDataFrame)
write.csv(ourDF,ourFile)  
sqlQuery(OurConnection, "CREATE TABLE myTable ( la [La], laLa [LaLa]);
                         BULK INSERT myTable FROM 'ourFile'
                              WITH (yourParams);")

If you're running this between servers, you need a network drive that the R server can write to and that SQL Server can read from (e.g. one server with permission to write to the DB uses Rscript to productionalize the code).
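To make the pseudocode concrete, the BULK INSERT step might look like the sketch below. The UNC path and the WITH options are assumptions; check the T-SQL BULK INSERT documentation for the options that match your file format and server version:

```r
library(RODBC)

# The UNC path is a placeholder: it must be writable by the R process
# and readable by the SQL Server service account.
# FIRSTROW = 2 skips the header line that write.csv produces.
sqlQuery(OurConnection, "
  BULK INSERT myTable
  FROM '\\\\fileserver\\share\\ourFile.csv'
  WITH (FIELDTERMINATOR = ',',
        ROWTERMINATOR = '\\n',
        FIRSTROW = 2);")
```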

quickreaction
1

From everything I can find, there is NO bulk-insert solution for MS SQL from R, and nothing that works with SSIS, which is why Microsoft is including in-database analytics with SQL Server 2016 after buying Revolution R Analytics.

I tried to comment on the previous answer but don't have the reputation to do it.

The rsqlserver package needs to run with rClr, and neither of those packages is well-behaved, especially because rsqlserver's INSERT functions have poor data type handling. So if you use it, you'll have no idea what you're looking at in the SQL table, as much of the information in your data.frame will have been transformed.

Considering the RODBC package has been around for 15 years, I'm pretty disappointed that no one has created a bulk insert function...

quickreaction
  • important point on rsqlserver, but for many of us we don't need to 'look at' the data (from an R standpoint). If it's been modeled and shaped and processed in R, we just need the result back in the database, and it doesn't matter what R transforms the types to in the database (as long as they are reasonable and can be read by other systems) – jpmorris Oct 26 '16 at 20:02
  • @Joe I don't think I'd ever hire someone who says "we don't need to 'look at' the data". – quickreaction Jul 17 '19 at 23:12
1

Our n2khelper package can use bcp (bulk copy) when it is available. When it is not available, it falls back to multiple INSERT statements.

You can find the package at https://github.com/INBO-Natura2000/n2khelper

Install it with devtools::install_github("INBO-Natura2000/n2khelper") and look for the odbc_insert() function.

Thierry