I'm using Spark 1.4.1. When I try to broadcast a random forest model, I get this error:
Traceback (most recent call last):
  File "/gpfs/haifa/home/d/a/davidbi/codeBook/Nice.py", line 358, in <module>
    broadModel = sc.broadcast(model)
  File "/opt/apache/spark-1.4.1-bin-hadoop2.4_doop/python/lib/pyspark.zip/pyspark/context.py", line 698, in broadcast
  File "/opt/apache/spark-1.4.1-bin-hadoop2.4_doop/python/lib/pyspark.zip/pyspark/broadcast.py", line 70, in __init__
  File "/opt/apache/spark-1.4.1-bin-hadoop2.4_doop/python/lib/pyspark.zip/pyspark/broadcast.py", line 78, in dump
  File "/opt/apache/spark-1.4.1-bin-hadoop2.4_doop/python/lib/pyspark.zip/pyspark/context.py", line 252, in __getnewargs__
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transforamtion. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
Here is an example of the code I'm trying to execute:
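from pyspark import SparkContext
from pyspark.mllib.tree import RandomForest
sc = SparkContext(appName="Something")
model = RandomForest.trainRegressor(sc.parallelize(data), categoricalFeaturesInfo=categorical, numTrees=100, featureSubsetStrategy="auto", impurity='variance', maxDepth=4)
broadModel = sc.broadcast(model)  # the exception above is raised on this line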
If someone can help me with this, I would be very thankful. Thanks a lot!