3

I would like to connect to DynamoDB with R. My ultimate goal is to create a Shiny App to display data that is stored at DynamoDB and updated frequently. So I need an efficient way to retrieve it using R.

The following references give an intuition but they do not include a native implementation in R and have not been updated for a long time.

r language support for AWS DynamoDB
AWS dynamodb support for "R" programming language
R + httr and EC2 api authentication issues

As mentioned in the answers above, running Python within R through rPython would be an option as there are SDKs for Python such as boto3.

Another alternative would be using a JDBC driver through RJDBC, which I tried:

library(RJDBC)

drv <- JDBC(
  driverClass = "cdata.jdbc.dynamodb.DynamoDBDriver", 
  classPath = "MyInstallationDir\lib\cdata.jdbc.dynamodb.jar", 
  identifier.quote = "'"
)

conn <- dbConnect(
  drv,
  "Access Key=xxx;Secret Key=xxx;Domain=amazonaws.com;Region=OREGON;"
) 

(Access Key and Secret Key replaced by xxx) and I got the error:

Error in .verify.JDBC.result(jc, "Unable to connect JDBC to ", url) : 
  Unable to connect JDBC to Access Key=xxx;Secret 
Key=xxx;Domain=amazonaws.com;Region=OREGON;

What would be the best practice in this matter? Is there a working, native solution for R? I would appreciate if anyone could point me in the right direction.

Note: The package aws.dynamodb (https://github.com/cloudyr/aws.dynamodb) looks promising but the documentation lacks examples and I could not find any tutorial for it.

OzanStats
  • 2,756
  • 1
  • 13
  • 26
  • Have you tried the AWS API documentation? Your can interact with Dynamo using simple HTTP calls, a high level SDK for python or R is not a requirement... https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Operations_Amazon_DynamoDB.html – Renato Byrro Jun 21 '18 at 11:22
  • @RenatoByrro; I tried but found it very painful. I don't have much experience with HTTP calls. You are very welcome to post an example solution below. – OzanStats Jul 28 '18 at 06:30
  • I'm not familiar with R. Connecting to external DBs would be something trivial in Python, and its data science libraries are very powerful. Chances are whatever you're trying to do in R, there's a way to do in Python as well. Isn't it an option for your use case to rely only on Python instead of R? – Renato Byrro Jul 28 '18 at 12:53
  • R has very powerful tools for relational databases as well but not for NoSQL ones. I ended up benefiting from `boto3` to pull the data into R as I am building a Shiny App. – OzanStats Jul 28 '18 at 18:19
  • You may want to have a peak at the following package: https://github.com/Appsilon/shiny.collections. It would be nice to have something like this available based on a more mature database platform than ReThinkDB. – Jochem Jan 03 '19 at 14:58

1 Answers1

2

I would like to share some updates so that people with the same issue can benefit from this post:

First, I figured out how to use the JDBC driver with a few tweaks:

library(DBI)
library(RJDBC)

drv <- JDBC(
  driverClass = "cdata.jdbc.dynamodb.DynamoDBDriver", 
  classPath = "/Applications/CData/CData JDBC Driver for DynamoDB 2018/lib/cdata.jdbc.dynamodb.jar", 
  identifier.quote = "'"
)

conn <- dbConnect(
  drv, 
  url = 'jdbc:dynamodb: Access Key=xxx; SecretKey=xxx; Domain=amazonaws.com; Region=OREGON;'
)

dbListTables(conn)

Second, I realized that reticulate makes it very convenient (even more than rPython) to run Python code inside R and ended up using reticulated boto3 to get data from DynamoDB into R. You can refer to the following documentations for additional info:
reticulate
boto3 - DynamoDB

Last, I heard that RStudio is planning to build a NoSQL database driver (which would be compatible with DBI, dbplyr, pool etc.) but probably it won't be available sometime soon.

Hope someone will create an R package as comprehensive as boto3 for AWS as it gets more and more popular.

OzanStats
  • 2,756
  • 1
  • 13
  • 26