1

Trying to connect to Postgres using DBI for first time to use dbplyr.

Issue: I am trying to use dbplyr with my exiting odbc connection but dbplyr doesn't seems to work with that so trying to create a new connection using DBI but this dbi connection is giving an error. So I am looking to fix dbi connection.

  1. When I try to replicate my existing connection using DBI then connection doesn't work:

libs

library(odbc)
library(RODBC)

library(DBI)
library(dplyr)
library(dbplyr)
# from: https://stackoverflow.com/questions/59413904/reading-data-from-a-sql-server-in-rstudio-dplyr
# from: /help/library/DBI/html/dbConnect.html
# from: https://github.com/r-dbi/odbc#odbc

dbicon <-  DBI::dbConnect( 
               odbc::odbc(),
               driver = "PostgreSQL Unicode", 
               database = "Postgres_xyz_db", 
               host = "some_xyz.amazonaws.com", 
               port = "5432", 
               uid = "user",
               pwd = "users_password")

# connect with table
tbl_qry <- tbl(dbicon, "mydb_demo.demo_table")

tbl_qry %>% head()

Error with dbConnect step:

Error: nanodbc/nanodbc.cpp:1021: 00000: could not connect to server: No such file or directory
    Is the server running locally and accepting
    connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
 
In addition: Warning message:
In for (i in seq_len(n)) { : closing unused RODBC handle 2

PS: I am running this on localhost:8787

  1. My Existing odbc / RODBC connection which is working for the same db.

libs

library(odbc)
library(RODBC)

library(dplyr)
library(dbplyr)

Existing Working connection:

## #################################################################
##                         Connection Para
## #################################################################

driver.name <- "PostgreSQL Unicode"
db.name <- "Postgres_xyz_db"
host.name <- "some_xyz.amazonaws.com" 
port <-"5432"
user.name <-"user"
pwd <- "users_password"

## #################################################################
##                      connect to a database
## #################################################################

con.text <- paste("DRIVER=",driver.name,
                  ";Database=",db.name,
                  ";Server=",host.name,
                  ";Port=",port,
                  ";PROTOCOL=TCPIP",
                  ";UID=", user.name,
                  ";PWD=",pwd,sep="")

con1 <- odbcDriverConnect(con.text)

But dbplyr doesn't work with this connection

# connect with table
tbl_qry <- tbl(con1, "mydb_demo.demo_table")

tbl_qry %>% head() 

(I am not really a techy or db admin or devops guy so pls excuse if it looks kind of filled with basic mistakes).

ViSa
  • 1,563
  • 8
  • 30
  • You use `"some_xyz.amazonaws.com"` but then say that you're *"running this on localhost:8787"*. Which is it? – r2evans Oct 15 '21 at 10:01
  • @r2evans: I am using `Rstudio` by `ssh tunneling` so I get `localhost:8787` & the actual host name is like `sandbox......some ...xyz ...amazon.aws.com`. And by using the same creds my odbc connection in the 2nd pointer works. – ViSa Oct 15 '21 at 11:54
  • If you are connecting through a tunnel, then you should never use the real hostname, always your local interface (i.e., `"localhost"`). The reason to port-forward through a tunnel is often because you cannot otherwise reach it directly, so using `host="some_xyz.amazonaws.com"` seems like it is known to not work. – r2evans Oct 15 '21 at 11:59
  • But when I changed to `host = "localhost"` I still get this error `Error: nanodbc/nanodbc.cpp:1021: 00000: could not connect to server: No such file or directory Is the server running locally and accepting connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?` – ViSa Oct 15 '21 at 12:06
  • 1
    Try `server="localhost"` instead of `host="..."`; `odbc` prefers `server=`, I believe. – r2evans Oct 15 '21 at 12:09
  • Actually `server = "localhost` didn't work but `server = sandbox .. some ..xyz ...amazonaws.com` worked. so I needed to replace `host` with server. If you can add that in answer then I will accept it. But I am also facing a issue now: `tbl_qry <- tbl(con1, "mydb_demo.demo_table")` is now working. Although I can see `mydb_demo` in connections tab in Rstudio. – ViSa Oct 15 '21 at 12:20
  • 1
    for your other error, consider something like `tbl(con1, `[`in_schema`](https://dbplyr.tidyverse.org/reference/in_schema.html)`("mydb_demo", "demo_table"))`. – r2evans Oct 15 '21 at 12:25
  • 1
    Yes this worked well. Highly Appreciate your help. You are amazing dude !!! – ViSa Oct 15 '21 at 12:28

1 Answers1

2

odbc connections prefer server= over host=, so your connection attempt should likely be

dbicon <-  DBI::dbConnect( 
               odbc::odbc(),
               driver = "PostgreSQL Unicode", 
               database = "Postgres_xyz_db", 
               server = "some_xyz.amazonaws.com", 
               port = "5432", 
               uid = "user",
               pwd = "users_password")
r2evans
  • 141,215
  • 6
  • 77
  • 149
  • 1
    Thanks @r2evans for helping me on this. Was stuck on this for quite some time !! – ViSa Oct 15 '21 at 12:25
  • BTW, related question I asked some time ago: https://stackoverflow.com/questions/61825087/how-to-find-odbc-driver-specific-arguments, and the still-open FR at `nanodbc` that would help inform this programmatically and interactively: https://github.com/nanodbc/nanodbc/issues/235. – r2evans Oct 15 '21 at 13:13
  • 1
    yes, thankyou for sharing these links... though I didn't understand much about it but its a good start for me as well :) – ViSa Oct 16 '21 at 06:39
  • It seems when using `server` odbc will attempt to connect to the database over tcp, whereas with `host` it will attempt to connect to the db via unix sockets. In my case my db was only accepting requests via tcp. So this solution fixed it for me. Thanks! – Lingster Apr 26 '22 at 07:44