
I am a beginner in PySpark, trying to execute a few lines of code in a Jupyter notebook. I followed the (fairly old) instructions available on the internet (https://changhsinlee.com/install-pyspark-windows-jupyter/) to configure PySpark after installing Python 3.8.5, Java (jdk-16), and spark-3.1.1-bin-hadoop2.7.
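For reference, the environment variables that guide sets up can also be set from inside the notebook before calling findspark.init(). A minimal sketch, assuming hypothetical install paths (adjust them to wherever the JDK and Spark were actually unpacked):

import os

# Hypothetical Windows paths -- replace with your real install locations.
os.environ["JAVA_HOME"] = r"C:\Program Files\Java\jdk-11"
os.environ["SPARK_HOME"] = r"C:\spark\spark-3.1.1-bin-hadoop2.7"
os.environ["HADOOP_HOME"] = r"C:\hadoop"  # folder expected to contain bin\winutils.exe on Windows

import findspark
findspark.init()  # findspark locates Spark via SPARK_HOME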

Below are the lines that executed successfully after installation; an exception is thrown at df.show(). I have added all the necessary environment variables. Please help me resolve this.

# shell / notebook cell:
pip install pyspark
pip install findspark

# Python:
import findspark
findspark.init()

import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.sql("SELECT 'Hello' AS greeting")  # the original '''Hello''' on its own is not a valid SQL statement
df.show()  # <-- exception is thrown here

I have added the error in the comments section.

Note: I am a beginner in Python and have no Java knowledge.

  • Exception (from the traceback): "# This SparkContext may be an existing one." at line 228, `sc = SparkContext.getOrCreate(sparkConf)`; lines 229-230: "# Do not update `SparkConf` for existing `SparkContext`, as it's shared by all sessions." – NikRED Mar 21 '21 at 14:20
  • Check this once: https://stackoverflow.com/questions/44502872/how-can-i-get-the-current-sparksession-in-any-place-of-the-codes/44504213 – Emad Mar 21 '21 at 16:20
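For context, the "This SparkContext may be an existing one" message in that traceback comes from getOrCreate() reusing a context that is already running. A minimal sketch of that behaviour (not the fix, just what the message refers to):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # creates the session on first call
same = SparkSession.builder.getOrCreate()   # returns the existing session instead of building a new one
print(spark is same)  # True -- all callers share one session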

1 Answer


Had to change the Java version to Java 11. It works now. Spark 3.1.1 supports Java 8 and Java 11 but not newer JDKs, so the originally installed JDK 16 was the cause of the exception.
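A quick way to verify the fix, as a minimal sketch: check which Java the shell resolves before starting Spark (assuming java is on PATH), then run a trivial query.

import subprocess

# java prints its version banner to stderr, not stdout
print(subprocess.run(["java", "-version"], capture_output=True, text=True).stderr)

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
print(spark.version)  # e.g. 3.1.1
spark.sql("SELECT 'Hello' AS greeting").show()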
