
I would like to know the PySpark equivalent of the following Scala code. I am using Databricks and need the same output as below:

To create a new Spark session and output the session id (SparkSession@123d0e8):

val new_spark = spark.newSession()

**Output** 
new_spark: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@123d0e8

To view the SparkContext and output its id (SparkContext@2dsdas33):

new_spark.sparkContext
**Output** 
org.apache.spark.SparkContext = org.apache.spark.SparkContext@2dsdas33

  • Refer: https://stackoverflow.com/questions/39780792/how-to-build-a-sparksession-in-spark-2-0-using-pyspark – hagarwal

2 Answers


A SparkSession can be created as described in the PySpark API docs (http://spark.apache.org/docs/2.0.0/api/python/pyspark.sql.html):

>>> from pyspark.sql import SparkSession
>>> from pyspark.conf import SparkConf
>>> spark = SparkSession.builder.config(conf=SparkConf()).getOrCreate()
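
If you need specific settings, a minimal sketch is to populate the SparkConf before passing it in (the app name and config key here are just illustrative):

>>> conf = SparkConf().setAppName('FirstSparkApp').set('spark.sql.shuffle.partitions', '8')
>>> spark = SparkSession.builder.config(conf=conf).getOrCreate()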

or

>>> from pyspark.sql import SparkSession
>>> spark = SparkSession.builder.appName('FirstSparkApp').getOrCreate()
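
To inspect the session and its SparkContext, analogous to the Scala output in the question, just evaluate them in the REPL (the memory address, master, and app name below are illustrative):

>>> spark
<pyspark.sql.session.SparkSession object at 0x7f183f464850>
>>> spark.sparkContext
<SparkContext master=local[*] appName=FirstSparkApp>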
– hagarwal

It's very similar. If you already have a session and want to open another one, you can use

my_session = spark.newSession()

print(my_session)

This will print the new session object that I think you are trying to create:

<pyspark.sql.session.SparkSession object at 0x7fc3bae3f550>

`spark` is already a running session object because you are using a Databricks notebook.
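
For the second part of your Scala snippet: newSession() reuses the underlying SparkContext rather than starting a new one, so both sessions report the same context (the master and app name shown are illustrative):

print(my_session.sparkContext)
# e.g. <SparkContext master=local[*] appName=Databricks Shell>

print(my_session.sparkContext is spark.sparkContext)
# True, because newSession() shares the existing SparkContext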

– Oscar Lopez M.