Hello I am having difficulty connecting to a SQL server database that a work colleague set up for me. I mostly run R/Rstudio on a Linux Centos 6.6 machine, but sometimes on AWS EC2 instances and my local Windows PC. In the past I have connected to AWS Redshift that was set up by someone else and I was able to establish connections to it using “src_postgres” dplyr function. I would like to make a connection like the “dplyr” option if possible so I can reuse some of the code I developed to work with those tables. The person who created the SQL server database created a username, password, and host name (***.net). My colleague who uses windows and SAS was able to use his Windows username/password that we use to log into our PCs at work. Can I use that username/password too since it looks like it is an option for me on Linux too, or do I have to use the specific one that was created for me just for SQL server?
I tried to do some research on what the best way to do this and below is how I think I should do this along with some findings. It looks like best option is to use RSQLServer (https://github.com/imanuelcostigan/RSQLServer) but I am open to other suggestions (using RODBC via https://support.rstudio.com/hc/en-us/articles/214510788-Setting-up-R-to-connect-to-SQL-Server-). It looks like I have to download/install some items (e.g., drivers, sql.yaml) before I am able to do this. First I think I need to install the correct SQL server driver for my Centos system (and later for EC2 instances). For the Centos system I use most of the time can/should I use the “Red Hat” drivers, since I can’t seem to find one for Centos? I also wonder if I need to install an “authentication driver” if I want to use my Windows login credentials that I use for work (Do I use this http://jtds.sourceforge.net/ or this https://msdn.microsoft.com/en-us/library/hh568450(v=sql.110).aspx?). Furthermore, is there a “unixODBC” driver that I need to install as well (https://msdn.microsoft.com/en-us/library/hh568449(v=sql.110).aspx)? Once I get these drivers installed (any others?) then I need to create a “sql.yaml” file to serve my server details. However, I am not sure how to create this file and where it should be located (e.g., use Notepad++ or to create file and just place file inside working directory)? It looks like I would create a separate entry within that file for the SQL server that I am using. I created a “sql.yml” file that I copied directly from here (https://github.com/imanuelcostigan/RSQLServer) and placed in within the working directory. When I tried to run the example in Rstudio I get the error below.
aw <- RSQLServer::src_sqlserver("AW", database = "AdventureWorks2012")
Error in rJava::.jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", url, :
java.sql.SQLException: Unknown server host name 'AW'.
I also tried to “odbcDriverConnect” R function after trying to install the ODBC connection on that server, but got the following error.
dbConnect(RSQLServer::SQLServer(), server="****", username="****",password="****", database = "****")
[RODBC] ERROR: state IM002, code 0, message [unixODBC][Driver Manager]Data source name not found, and no default driver specified
I am not sure if the sql.yaml file is correct or if the drivers are the problem, and I am not sure what do try next… My linux IT skills to do this are limited, but I can follow instructions... :) I was wondering if anyone could provide details on what I need to install and setup to get this working (e.g., what to type into command line). I suspect that I don’t have the appropriate drivers installed and I am not sure which to try and what the appropriate commands are (e.g., those from jTDS, from Microsoft, etc.?). Thanks in advance for your assistance!
UPDATE
Thanks Valentin! I am able to connect that way on my local windows PC using an ODBC connection, but was not able to get it working with the RSQLServer R library function on windows. I confirmed that I connect using R with windows OS with both using the trusted user option and the username and password set up on SQL server. I am also able connect to the database using a JDBC connection with Rstudio server (see below for what worked).
drv <- JDBC(
driverClass = "net.sourceforge.jtds.jdbc.Driver",
classPath = "/**** /RSQLServer/java/jtds-1.2.8.jar",
identifier.quote="`")
conn <- dbConnect(drv,
"jdbc:jtds:sqlserver://****.net/DBTable",
"userid",
"password")
My problem is that I can't connect using the Rstudio linux server using an ODBC connection (ODBC (maybe FreeTDS driver?) and/or the RSQLServer R library (maybe need to use jTDS and register it?). I would like to figure out how to use "RSQLServer" R library so I can utilize the dplyr backend connection, so I would like to get that option figured out.
I created the suggested sql.yaml file with the following information (https://github.com/imanuelcostigan/RSQLServer) and placed it within my R working directory. However, when I try to run follow the example on the RSQLServer github site that attempts to connect to this SQL server dataset (http://sqlblog.com/blogs/jamie_thomson/archive/2012/03/27/adventureworks2012-now-available-to-all-on-sql-azure.aspx) I can the following errors:
#using driver specified above
aw <- dbConnect(drv, "AW", database = 'AdventureWorks2012')
Error in .verify.JDBC.result(jc, "Unable to connect JDBC to ", url) :
Unable to connect JDBC to AW
#trying to use the connection specifid in the sql.yaml file
aw <- RSQLServer::src_sqlserver("AW", database = "AdventureWorks2012")
Error in rJava::.jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", url, :
java.sql.SQLException: Unknown server host name 'AW'.
I think it the jTDS driver is not set up correctly, or I am not doing something correct when creating the sql.yaml file within the R working directory (should it be placed somewhere else?). Thanks again for any suggestions!