
I've added the SQL Database and Spark services to my application and have successfully imported data into the database.

Now I'm trying to load this data into Spark for processing using JDBC. I've connected the database via the Spark Data tab and imported it into Spark as a source via Data Sources. The database gives me the following SSL connection string from the "Connect Applications" tab:

jdbc:db2://75.126.155.153:50001/SQLDB:securityMechanism=9

which I've tried to connect to with Spark (written in Scala):

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val driver = "com.ibm.db2.jcc.DB2Driver"
Class.forName(driver)
val data = sqlContext.load("jdbc", Map(
  "url" -> "  jdbc:db2://75.126.155.153:50001/SQLDB:securityMechanism=9",
  "dbtable" -> "HAWAII"))

However, I get the following error:

Name: java.sql.SQLException
Message: No suitable driver found for   jdbc:db2://75.126.155.153:50001/SQLDB:securityMechanism=9
StackTrace: java.sql.DriverManager.getConnection(DriverManager.java:608)
java.sql.DriverManager.getConnection(DriverManager.java:199)
org.apache.spark.sql.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:118)
org.apache.spark.sql.jdbc.JDBCRelation.<init>(JDBCRelation.scala:128)
org.apache.spark.sql.jdbc.DefaultSource.createRelation(JDBCRelation.scala:113)
org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:269)
org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
org.apache.spark.sql.SQLContext.load(SQLContext.scala:1253)
$line18.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:16)
$line18.$read$$iwC$$iwC$$iwC.<init>(<console>:21)
$line18.$read$$iwC$$iwC.<init>(<console>:23)
$line18.$read$$iwC.<init>(<console>:25)
$line18.$read.<init>(<console>:27)
$line18.$read$.<init>(<console>:31)
$line18.$read$.<clinit>(<console>)
java.lang.J9VMInternals.initializeImpl(Native Method)
java.lang.J9VMInternals.initialize(J9VMInternals.java:235)
$line18.$eval$.<init>(<console>:7)
$line18.$eval$.<clinit>(<console>)
java.lang.J9VMInternals.initializeImpl(Native Method)
java.lang.J9VMInternals.initialize(J9VMInternals.java:235)
$line18.$eval.$print(<console>)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:56)
java.lang.reflect.Method.invoke(Method.java:620)
org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
com.ibm.spark.interpreter.ScalaInterpreter$$anonfun$interpretAddTask$1$$anonfun$apply$3.apply(ScalaInterpreter.scala:296)
com.ibm.spark.interpreter.ScalaInterpreter$$anonfun$interpretAddTask$1$$anonfun$apply$3.apply(ScalaInterpreter.scala:291)
com.ibm.spark.global.StreamState$.withStreams(StreamState.scala:80)
com.ibm.spark.interpreter.ScalaInterpreter$$anonfun$interpretAddTask$1.apply(ScalaInterpreter.scala:290)
com.ibm.spark.interpreter.ScalaInterpreter$$anonfun$interpretAddTask$1.apply(ScalaInterpreter.scala:290)
com.ibm.spark.utils.TaskManager$$anonfun$add$2$$anon$1.run(TaskManager.scala:123)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1157)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627)
java.lang.Thread.run(Thread.java:801)

I've searched this error, and this answer says that I need to import the driver jar, which I have tried with

%AddJar
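
(For reference, a full invocation of the %AddJar notebook magic typically takes the jar's URL, plus an optional -f flag to force a re-download; the URL below is only a placeholder for wherever the DB2 JCC driver jar is hosted, not the one from my actual attempt.)

%AddJar https://example.com/path/to/db2jcc4.jar -f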

However, it gives the same error. Any ideas?

  • Assuming there is nothing wrong with your `%AddJar` and the driver version itself, try adding the driver name to the properties map: `Map("url" -> "...", "driver" -> driver, "dbtable" -> "HAWAII")` – zero323 Jan 28 '16 at 17:43
  • I added the driver argument and no dice. However, I searched a little more and found [this](http://stackoverflow.com/questions/34896505/connecting-to-a-postgresql-db-using-jdbc-from-the-bluemix-apache-spark-service?rq=1). Thanks for the suggestion, though! – Chris Jan 28 '16 at 18:17

2 Answers


You would need to supply the driver in the `load` method. Can you please try this?

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val tmpdata1 = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:db2://75.126.155.153:50000/SQLDB:securityMechanism=9;currentSchema=USER13878;user=<ur-username>;password=xxxxx;",
  "driver" -> "com.ibm.db2.jcc.DB2Driver",
  "dbtable" -> "USER13878.MYTABLE"))

Thanks, Charles.

charles gomes

Finally got it... I believe I was using a deprecated version of Spark's `load()` function. The API documentation leads me to think it's outdated, especially since the notebook says its version is Spark 1.4.

Answer

Looking here, I found a very detailed answer on the code structure. The code should be:

val url = "jdbc:db2://75.126.155.153:50000/SQLDB"

// Credentials go in a Properties object instead of the URL
val prop = new java.util.Properties
prop.setProperty("user", "username")
prop.setProperty("password", "xxxxxx")

// Spark 1.4 DataFrameReader API: read.jdbc(url, table, properties)
val test = sqlContext.read.jdbc(url, "HAWAII", prop)
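
For anyone following along, a quick sanity check on the returned DataFrame (printSchema and show are standard DataFrame methods in Spark 1.4+):

// Confirm the table loaded by printing its schema and a few rows
test.printSchema()
test.show(5)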
Chris