
I am currently working with Jupyter (Lab) and PySpark 2.1.1.

I want to change spark.yarn.queue and the master from a notebook. Because of the kernel, spark and sc are available when I open a notebook.
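For example, both can be used immediately (a minimal check; the actual values depend on how the kernel starts the session):

# sc and spark are injected by the kernel at notebook startup.
print(sc.master)      # the current master, e.g. yarn
print(spark.version)  # 2.1.1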

Following this question, I tried

spark.conf.set("spark.yarn.queue", "my_queue")

But according to spark.sparkContext.getConf(), the above line has no effect.
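For instance, reading the value back from the context shows the set() call is not picked up:

# getConf() reflects the configuration the context was started with,
# so the runtime set() call above is not visible here.
print(spark.sparkContext.getConf().get("spark.yarn.queue"))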

spark.conf.setMaster("yarn-cluster")

does not work either, because there is no such method on spark.conf.

Question: How can I change the configuration (queue and master) from a Jupyter notebook?

(Or should I set any environment variables?)


1 Answer


You can try to initialize Spark beforehand, rather than in the notebook. Run this in your terminal:

export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

pyspark --master <your master> --conf <your configuration> <or any other options that pyspark supports>
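For example, with the queue from the question (a sketch assuming YARN; interactive shells such as pyspark run in client deploy mode, so --master yarn is used rather than yarn-cluster):

# Launches Jupyter with a SparkSession bound to the my_queue YARN queue.
pyspark --master yarn --conf spark.yarn.queue=my_queue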

My source
