3

This is how I have successfully been reading data from an SQL server in Databricks Runtime 10.4, using the legacy JDBC driver:

remote_table = (
  spark.read.format("jdbc")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .option("url", "jdbc:sqlserver://mydomainname.com;database=Runtime")
    .option("dbtable", "Runtime.dbo.History")
    .option("user", "myusername")
    .option("password", "mypassword")
    .load()
)

I want to use the "newer" approach with Runtime 12.2:

remote_table = (spark.read
  .format("sqlserver")
  .option("host", "mydomainname.com")
  .option("user", "myusername")
  .option("password", "mypassword")
  .option("database", "Runtime")
  .option("dbtable", "Runtime.dbo.History")
  .load()
)

But an error is returned:

com.microsoft.sqlserver.jdbc.SQLServerException: The driver could not establish a secure connection to SQL Server by using Secure Sockets Layer (SSL) encryption. Error: "PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target". ClientConnectionId:56f3b77d-a90a-4c89-98be-88924f364897

And yes, I am remembering to switch the cluster to one with Runtime 12.2.

Oliver Angelil
  • 1,099
  • 15
  • 31

1 Answers1

1

"sqlserver" connector is just a wrapper over JDBC and you would encounter same issue on runtime 12.x if you attempted legacy JDBC connection.

While Databricks runtime 10.x used SQL Server JDBC driver version 9.x, there was a breaking change in version 10.2 (which is used in 12.x runtime) that enabled TLS encryption by default and forced certificate validation. For this similar issue answered here. Extra info in SQL Server docs section Understanding encryption support.

In short, add following options:

.option("encrypt", "false")

or

.option("encrypt", "true")
.option("trustServerCertificate", "true")
Kombajn zbożowy
  • 8,755
  • 3
  • 28
  • 60
  • Both methods work. Is the second method (with "encrypt", "True"), safer? – Oliver Angelil Jul 16 '23 at 04:10
  • I'm not sure about case sensitivity here, but I've been always using lowercase for true/false here. This is also how it's written in docs. – Kombajn zbożowy Jul 16 '23 at 07:20
  • not asking about case sensitivity - rather, is the second approach with `.option("encrypt", "True")` recommended over the first? – Oliver Angelil Jul 16 '23 at 07:22
  • Sorry, the pending edit confused me. Yes, enabled encryption, even blindly trusting server certificate, can be considered more secure. And actually most secure option would be to import proper certificate and use it (if I learn how I'll add :D) – Kombajn zbożowy Jul 16 '23 at 07:31
  • You can ignore the pending edit. Thanks for your help! – Oliver Angelil Jul 16 '23 at 08:18