That’s pretty straightforward. To connect to an external database and retrieve data into Spark dataframes, an additional jar file is required.
For example, with MySQL the JDBC driver is required. Download the driver package and extract mysql-connector-java-x.yy.zz-bin.jar to a path that is accessible from every node in the cluster. Preferably this is a path on a shared file system. With a Pouta Virtual Cluster, for instance, such a path would be under /shared_data; here I use /shared_data/thirdparty_jars/.
With direct Spark job submissions from the terminal, one can specify the --driver-class-path argument pointing to extra jars that should be provided to the workers along with the job. However, this does not work with this approach, so these paths must be configured for both the front-end and worker nodes in the spark-defaults.conf file, usually found in the /opt/spark/conf directory.
Add whichever connector jar matches your database server to both class paths:

spark.driver.extraClassPath /shared_data/thirdparty_jars/mysql-connector-java-5.1.35-bin.jar
spark.executor.extraClassPath /shared_data/thirdparty_jars/mysql-connector-java-5.1.35-bin.jar
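
Once the connector is on both class paths, a dataframe can be read straight from the database over JDBC. Below is a minimal PySpark sketch; the host name, database, table, and credentials are placeholders for illustration, not values from this setup.

    from pyspark.sql import SparkSession

    # The session picks up spark.driver.extraClassPath and
    # spark.executor.extraClassPath from spark-defaults.conf,
    # so the MySQL driver is already available to every node.
    spark = SparkSession.builder.appName("mysql-read-example").getOrCreate()

    # All connection details below are hypothetical placeholders.
    df = (spark.read
          .format("jdbc")
          .option("url", "jdbc:mysql://db.example.com:3306/mydb")
          .option("driver", "com.mysql.jdbc.Driver")  # driver class for Connector/J 5.1.x
          .option("dbtable", "mytable")
          .option("user", "myuser")
          .option("password", "mypassword")
          .load())

    df.printSchema()
    df.show(5)

Note that the extraClassPath settings are read when the JVM starts, so restart the Spark processes (or your Spark session) after editing spark-defaults.conf for the change to take effect.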