Questions tagged [rjdbc]

RJDBC - Allows the use of JDBC to access databases from R

About

RJDBC is an package providing a database interfaces using . This allows the use of any in R through the JDBC interface. The only requirement is working and a JDBC driver for the database engine to be accessed. RJDBC uses the interface which is also used by the packages , , , and .

Example

library(RJDBC)

drv <- JDBC("com.mysql.jdbc.Driver",
           "/etc/jdbc/mysql-connector-java-3.1.14-bin.jar",
           identifier.quote="`")

conn <- dbConnect(drv, "jdbc:mysql://localhost/test", "user", "pwd")

Loads a JDBC driver for MySQL (adjust the path to the driver's JAR file as necessary) and connects to the local database "test". The connection handle conn is used for all subsequent operations.

(Note: Windows users can use drv<-JDBC("sun.jdbc.odbc.JdbcOdbcDriver") to use Sun's JDBC to ODBC Bridge with "jdbc:odbc:..." URL.)

In order to allow more complex names for tables and columns, it is a good idea to set identifier.quote to the quote character supported by the database for quoting identifiers. The default (NA) does not quote identifiers, but this limits the names that can be used, affecting dbReadTable and dbWriteTable.

dbListTables(conn)
data(iris)
dbWriteTable(conn, "iris", iris, overwrite=TRUE)
dbGetQuery(conn, "select count(*) from iris")
d <- dbReadTable(conn, "iris")

RJDBC supports prepared statements and argument substitution, so it is possible to run queries like:

dbGetQuery(conn, "select count(*) from iris where Species=?", "setosa")

Note that the life time of a connection, result set, driver etc. is determined by the lifetime of the corresponding R object. Once the R handle goes out of scope (or if removed explicitly by rm) and is garbage-collected in R, the corresponding connection or result set is closed and released. This is important for databases that have limited resources (like Oracle) - you may need to add gc() by hand to force garbage collection if there could be many open objects. The only exception are drivers which stay registered in the JDBC even after the corresponding R object is released as there is currently no way to unload a JDBC driver (in RJDBC).

Type Handling

Type-handling is a rather complex issue, especially with JDBC as different databases support different data types. RJDBC attempts to simplify this issue by internally converting all data types to either character or numeric values. When retrieving results, all known numeric types are converted to R's numeric representation and all other types are treated as characters. When assigning parameters in parametrized queries, numeric, integer and character are the types used. Convenience methods like dbReadTable and dbWriteTable can only use the most basic SQL types, because they don't know what DBMS will be used. Therefore dbWriteTable uses only INTEGER, DOUBLE PRECISION or VARCHAR(255) to create the table. For all other types you'll have to use DBML statements directly.

License

RJDBC is released under v2.

Repositories

Other resources

Related tags

200 questions
22
votes
4 answers

R RJDBC java.lang.OutOfMemoryError

I got following error while run dbGetQuery of RJDBC 0.2-4: Error in .jcall(rp, "I", "fetch", stride) : java.lang.OutOfMemoryError: Java heap space Following is my code: Sys.setenv(JAVA_HOME='C:/Program Files…
user3567601
  • 241
  • 1
  • 2
  • 4
14
votes
2 answers

rjdbc Parallel query with parallelmap

I'm trying to run my query in parallele and i get an 00001: Error in .jcheck() : No running JVM detected. Maybe .jinit() would help. error. The queries are working when i run them one by one My script: I know it's not really reproductible but i…
s.brunel
  • 1,003
  • 10
  • 24
13
votes
2 answers

How to prevent truncation of error messages in R

I am querying a database in R using RJDBC. The queries are built up from data which is read in from a file. These queries can get very long, and can potentially include non existent columns (resulting in an error). Below is a simplified example, it…
littlefeltfangs
  • 396
  • 2
  • 17
13
votes
6 answers

Connecting to MS SQL Server from R on Mac/Linux

I'm trying to connect to a Microsoft SQL Server from R on Mac/Linux, and I have problems with RJDBC. When I've downloaded both Microsoft's driver and JTDS, but none of the following lines work: library(RJDBC) drv <-…
Victor K.
  • 4,054
  • 3
  • 25
  • 38
11
votes
1 answer

Write to SAP HANA with RJDBC using dbWritetable very slow due to record by record insert

I am trying to write a large dataset (10 cols, 100M records) from R to SAP HANA using RJDBC's dbWritetable in the following way library("RJDBC") drv <- JDBC("com.sap.db.jdbc.Driver", "/data/hdbclient/ngdbc.jar", "'") database <- dbConnect(…
Michiel
  • 179
  • 1
  • 14
11
votes
2 answers

Are there performance/other downsides in creating a new RJDBC connections to MS SQL database for each request?

I would like to understand what is the best practice for (re)using SQL connections to a MS SQL database through RJDBC. I can imagine three possible scenarios: Store a connection in a global variable, initialize once, use it everywhere in the…
Victor K.
  • 4,054
  • 3
  • 25
  • 38
8
votes
3 answers

Error loading csv data into Hive table

i have a csv file in hadoop and i have a Hive table ,now i want to laoad that csv file into this Hive table i have used load LOAD DATA local 'path/to/csv/file' overwrite INTO TABLE tablename; ended up with this error : Error in…
Teja Nandury
  • 83
  • 1
  • 1
  • 4
7
votes
1 answer

R JDBC error "Unable to retrieve JDBC result set for insert into ..."

I am trying to write an R data.frame to a Netezza table. It has about 55K rows and I have set 4GB as memory limit for Java (options(java.parameters = "-Xmx4096m")) Query: insert into MY_TABLE_NAME select * from external…
KalC
  • 1,530
  • 3
  • 22
  • 33
7
votes
1 answer

RJDBC(dbGetQuery) - GC overhead limit exceeded

I'm using the package RJDBC and there the function dbGetQuery to get the ouput of a SQL query. The code works with SQL statements with not so much rows, but which statements which rows > 1.000.000 I get an error. Is there a parameter to handle the…
user43348044
  • 305
  • 3
  • 15
6
votes
1 answer

Writing a data frame to a Teradata table using RJDBC

After establishing a connection as explained here, I tried to write a very simple data frame (trythis, shown below) to a table called gh_test_20141105 in a database called p_cia_t. First, I tried > conn <- getTdConnection(vdm='vivaldi') >…
Gabi
  • 1,303
  • 16
  • 20
5
votes
0 answers

How do you capture rJava standard out as R character vector?

When making a call using rJava, is there a way to capture the output java writes to the console/stdout in a character vector in R? Is there a way to do this without redirecting java's stdout to file and having R read the file? In my specific use…
Marcus
  • 3,478
  • 1
  • 7
  • 16
5
votes
1 answer

Server out-of-memory issue when using RJDBC in paralel computing environment

I have an R server with 16 cores and 8Gb ram that initializes a local SNOW cluster of, say, 10 workers. Each worker downloads a series of datasets from a Microsoft SQL server, merges them on some key, then runs analyses on the dataset before writing…
5
votes
2 answers

Java exception error during RJDBC::dbConnect: how to silently ignore?

I am trying to connect to a SAS-driven remote database from within R, using RJDBC. The first time I do a dbConnect, I get an error: Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], : java.lang.NoClassDefFoundError:…
Wouter Thielen
  • 1,016
  • 9
  • 21
4
votes
2 answers

RJDBC: Error in class(obj) <- "rs.scalar" : cannot set attribute on a symbol

I am trying to run my R code, which worked perfectly fine two months ago but now crashes when I try to read some data of Teradata into R, in an sql-chunk. The error that shows up is the following: Error in class(obj) <- "rs.scalar" : cannot set…
Z117
  • 201
  • 3
  • 12
4
votes
3 answers

Error connecting to amazon redshift from R - windows 10

When using the JDBC function to connect R to amazon redshift (I'm using windows 10) I get the following error: Error in .jfindClass(as.character(driverClass)[1]) : class not found The code I'm running…
Viola
  • 43
  • 1
  • 4
1
2 3
13 14