I get a ClassNotFoundException when I try to connect from Databricks on GCP:
Py4JJavaError: An error occurred while calling o1808.jdbc.
: com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host localhost, port 1433 has failed. Error: "java.lang.ClassNotFoundException: com.google.cloud.sql.sqlserver.SocketFactory. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:234)
at com.microsoft.sqlserver.jdbc.SQLServerException.ConvertConnectExceptionToSQLServerException(SQLServerException.java:285)
at com.microsoft.sqlserver.jdbc.SocketFinder.findSocket(IOBuffer.java:2466)
at com.microsoft.sqlserver.jdbc.TDSChannel.open(IOBuffer.java:672)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:2747)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:2418)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:2265)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:1291)
at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:881)
at org.apache.spark.sql.execution.datasources.jdbc.connection.BasicConnectionProvider.getConnection(BasicConnectionProvider.scala:49)
at org.apache.spark.sql.execution.datasources.jdbc.connection.ConnectionProviderBase.create(ConnectionProvider.scala:94)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$createConnectionFactory$1(JdbcUtils.scala:63)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:56)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:226)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:390)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:444)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:400)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:400)
<<snip>>
I'm using a Python notebook to connect:
password = "..."
connection_name = "...:...:..."  # project:region:instance, copied from the Cloud SQL console
# The Cloud SQL socket factory routes the connection, so the host stays localhost
url = (
    "jdbc:sqlserver://localhost;databaseName=avoidable_events;"
    "socketFactoryClass=com.google.cloud.sql.sqlserver.SocketFactory;"
    f"user=sqlserver;socketFactoryConstructorArg={connection_name};"
    f"password={password}"
)
display(spark.read.jdbc(url, table="thedb.thetable"))
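For clarity, here is the same call restated with spark.read.format("jdbc") and explicit options instead of one long URL. This is only an equivalent way of writing the read above (the URL still carries the socket-factory properties), not a fix:

# Equivalent restating of the jdbc read above with explicit options
df = (
    spark.read.format("jdbc")
    .option("url", url)  # the url built above
    .option("dbtable", "thedb.thetable")
    .load()
)
display(df)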
Cluster is 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12)
Libraries installed:
com.google.cloud.sql:cloud-sql-connector-jdbc-sqlserver:1.6.1 (via Maven)
com.microsoft.sqlserver:mssql-jdbc:10.2.0.jre8 (via Maven)
How do I fix this error?