I'm trying to build a recommender using Spark and just ran out of memory:
Exception in thread "dag-scheduler-event-loop" java.lang.OutOfMemoryError: Java heap space
I'd like to increase the memory available to Spark by modifying the spark.executor.memory
property from within PySpark, at runtime.
Is that possible? If so, how?
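For reference, this is the property I mean. I can read it from the running shell like this (a quick sketch, assuming the default sc that the PySpark shell creates), but I don't know how to change it without restarting:

# inspect the executor memory currently configured on the running context
# (sc is the SparkContext that the PySpark shell provides automatically)
print(sc.getConf().get("spark.executor.memory", "<not set>"))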
Update:
Inspired by the link in @zero323's comment, I tried to delete and recreate the context in PySpark:
del sc  # drop the Python name for the shell-provided context
from pyspark import SparkConf, SparkContext
conf = (SparkConf()
        .setMaster("spark://hadoop01.woolford.io:7077")  # standalone master uses the spark:// scheme, not http://
        .setAppName("recommender")
        .set("spark.executor.memory", "2g"))
sc = SparkContext(conf=conf)
returned:
ValueError: Cannot run multiple SparkContexts at once;
That's weird, since:
>>> sc
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'sc' is not defined
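My guess is that del sc only removes the Python name while the context itself stays registered on the PySpark side, but I'm not sure how to confirm that. This is the sort of check I have in mind (a sketch; _active_spark_context is a private attribute, so I may be looking in the wrong place):

from pyspark import SparkContext

# PySpark tracks the currently active context in a class attribute; if this
# still prints a SparkContext after `del sc`, the old context was never
# actually shut down, which would explain the ValueError above
print(SparkContext._active_spark_context)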