I'm trying to fetch table data as a DataFrame using PySpark.

The code below works, but I'm wondering whether I can specify the JDBC driver class name without calling the `option` function on `sparkSession.read()`.

```python
from pyspark.sql import SparkSession

ss = (
    SparkSession.builder.appName(args.job_name)
    .config("spark.jars.packages", "com.mysql:mysql-connector-j:8.0.31")
    .getOrCreate()
)

return ss.read \
    .option("driver", "com.mysql.cj.jdbc.Driver") \
    .jdbc(url=connection_str, table=table_name, column="id",
          lowerBound=0, upperBound=row_cnt, numPartitions=3)
```

Can I specify it via `--driver-class-path`, via a SparkSession config, or in some other way?
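
(For reference, I know the driver can also go into the `properties` dict of `jdbc()`, as sketched below, but that is still a per-read setting rather than a session-level one.)

```python
# Alternative I'm aware of: pass the driver in the `properties` dict --
# this works, but it is still specified per read, which is what I'm
# trying to avoid.
df = ss.read.jdbc(
    url=connection_str,
    table=table_name,
    column="id",
    lowerBound=0,
    upperBound=row_cnt,
    numPartitions=3,
    properties={"driver": "com.mysql.cj.jdbc.Driver"},
)
```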

Edit:

I also tried setting `spark.driver.extraClassPath`, as suggested in [How to specify driver class path when using pyspark within a jupyter notebook?](https://stackoverflow.com/questions/51772350/how-to-specify-driver-class-path-when-using-pyspark-within-a-jupyter-notebook), but it didn't help.
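
For completeness, this is roughly what I tried (a sketch; the jar path here is hypothetical):

```python
# Sketch of the attempt: pointing spark.driver.extraClassPath at the
# connector jar (the path below is hypothetical).
ss = (
    SparkSession.builder.appName(args.job_name)
    .config("spark.driver.extraClassPath", "/path/to/mysql-connector-j-8.0.31.jar")
    .getOrCreate()
)
# The subsequent read still raises the driver exception unless I add
# .option("driver", "com.mysql.cj.jdbc.Driver") explicitly.
```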

  • Does this answer your question? [How to specify driver class path when using pyspark within a jupyter notebook?](https://stackoverflow.com/questions/51772350/how-to-specify-driver-class-path-when-using-pyspark-within-a-jupyter-notebook) – samkart Oct 21 '22 at 11:12
  • Sadly, no :( I still need to use `option` to avoid the JDBC driver exception – mozzi Oct 21 '22 at 12:05
