
I am running Spark standalone and starting PySpark from the command line so that it opens an IPython notebook. Here is how I start PySpark:

PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" /usr/local/src/spark/spark-1.6.1-bin-hadoop2.6/bin/pyspark

The IPython notebook opens in my browser, and there is already a SparkContext sc that I can start using. However, I need to set a conf for the SparkContext:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf = SparkConf().setAppName("Cloudant Spark")
conf.set("jsonstore.rdd.schemaSampleSize", -1)

sc = SparkContext(conf=conf)   # new context with the custom conf
sqlContext = SQLContext(sc)    # SQLContext built on top of it

However, sc already exists, so it won't let me create another one. I tried calling sc.stop() first, but that gives me an error when I try to use the new sqlContext.
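For clarity, this is the stop-then-recreate order I attempted (a minimal sketch, assuming the Spark 1.6 API and that conf has already been built as above):

sc.stop()                      # stop the SparkContext the shell created at startup
sc = SparkContext(conf=conf)   # recreate it with the custom conf
sqlContext = SQLContext(sc)    # fresh SQLContext on the new context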

My questions are:

1. How can I set the conf?
2. Is there a better/different way of connecting an IPython notebook to PySpark standalone?
