
I use PySpark in my system, and I get this warning:

context.py:79: FutureWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead.

My script:

    scSpark = SparkSession.builder.config("spark.driver.extraClassPath", "./mysql-connector-java-8.0.29.jar").getOrCreate()
    sqlContext = SQLContext(scSpark)

    jdbc_url = "jdbc:mysql://{0}:{1}/{2}".format(hostname, jdbcPort, dbname)
    connectionProperties = {
        "user": username,
        "password": password
    }
    #df=scSpark.read.jdbc(url=jdbc_url, table='bms_title', properties= connectionProperties)
    #df.show()


    df = scSpark.read.csv(data_file, header=True, sep=",", encoding='UTF-8').cache()
    df2 = df.first()

    df = df.exceptAll(scSpark.createDataFrame([df2]))

    df.createTempView("books")

    output = scSpark.sql('SELECT `Postgraduate Course` AS Postgraduate_Course FROM books')

Why do I get this warning when I have already used SparkSession.builder.getOrCreate()? How can I get rid of it?

pyspark

2 Answers


Try changing

sqlContext = SQLContext(scSpark)

to

sqlContext = scSpark.sparkContext

or, if you only need the underlying context,

sc = scSpark.sparkContext

SQLContext is deprecated since Spark 3.0; the SparkSession (your scSpark) already provides its functionality, so you can call scSpark.sql(...) directly instead of creating a SQLContext at all. You can find more details here: Difference between SparkContext, JavaSparkContext, SQLContext, and SparkSession?

Russel FP

Try changing this:

scSpark = SparkSession.builder.config("spark.driver.extraClassPath", "./mysql-connector-java-8.0.29.jar").getOrCreate()

to this:

scSpark = SparkSession.builder.config("spark.driver.extraClassPath", "./mysql-connector-java-8.0.29.jar").enableHiveSupport().getOrCreate()

.enableHiveSupport() should fix it.

It also happened to me when I had toPandas() in my code, converting a PySpark DataFrame to a pandas DataFrame. In that case, I loaded the data with pandas from the start instead of loading it with PySpark and converting later.
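As a sketch of that approach (the CSV content below is made up for illustration), read with pandas directly when the end result is going to be a pandas DataFrame anyway:

```python
# Sketch: read the CSV with pandas from the start instead of reading it
# with PySpark and converting via toPandas() afterwards.
import io

import pandas as pd

# stand-in for a real data file on disk
csv_data = io.StringIO("Name,Postgraduate Course\nAlice,Data Science\nBob,Physics\n")
pdf = pd.read_csv(csv_data)

courses = pdf["Postgraduate Course"].tolist()
```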

Hadij