I am trying to run the TensorFlow Transform census example at https://www.tensorflow.org/tfx/tutorials/transform/census on a Databricks GPU cluster.
My environment:

    Databricks Runtime 7.1 ML (Spark 3.0.0, Scala 2.12, GPU)
    Python 3.7
    tensorflow==2.1.1
    tensorflow-transform==0.22.0
    apache_beam==2.21.0
When I run

    transform_data(train, test, temp)

I get this error:

    Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063
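For reference, this is roughly the shape of the tutorial's transform_data. It is a simplified sketch, not the tutorial's actual code: the 'age' feature spec and preprocessing_fn are toy stand-ins for the census ones, and beam.Create replaces the tutorial's CSV reading so the snippet stays self-contained:

    import apache_beam as beam
    import tensorflow as tf
    import tensorflow_transform as tft
    import tensorflow_transform.beam as tft_beam
    from tensorflow_transform.tf_metadata import dataset_metadata, schema_utils

    # Toy stand-ins for the tutorial's RAW_DATA_METADATA and preprocessing_fn.
    RAW_DATA_METADATA = dataset_metadata.DatasetMetadata(
        schema_utils.schema_from_feature_spec(
            {'age': tf.io.FixedLenFeature([], tf.float32)}))

    def preprocessing_fn(inputs):
        # Scaling needs a full analysis pass over the data; that pass is
        # what the Beam pipeline below executes.
        return {'age_scaled': tft.scale_to_0_1(inputs['age'])}

    def transform_data(train_data_file, test_data_file, working_dir):
        # The tutorial reads train_data_file/test_data_file as CSV here;
        # toy in-memory data is used instead to keep the sketch short.
        with beam.Pipeline() as pipeline:
            with tft_beam.Context(temp_dir=working_dir):
                raw_data = pipeline | beam.Create([{'age': 25.0}, {'age': 60.0}])
                raw_dataset = (raw_data, RAW_DATA_METADATA)
                # AnalyzeAndTransformDataset pickles preprocessing_fn and
                # runs the analysis pass on the Beam runner's workers.
                transformed_dataset, transform_fn = (
                    raw_dataset | tft_beam.AnalyzeAndTransformDataset(preprocessing_fn))
                # ... the tutorial then also transforms the test data and
                # writes TFRecords plus the transform_fn to working_dir ...

Nothing in this code references Spark explicitly; by default the pipeline runs on Beam's DirectRunner.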
This seems to be the known Spark restriction tracked as SPARK-5063: https://issues.apache.org/jira/browse/SPARK-5063
I searched for solutions here, e.g. "how to deal with error SPARK-5063 in spark", but none of them worked for me.
In the example code, I do not see anywhere that SparkContext is accessed from a worker explicitly. Is it accessed from inside Apache Beam?
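For comparison, this is the classic way to trigger SPARK-5063 in plain PySpark. It is not the tutorial's code, just a minimal illustration of the mechanism: anything pickled and shipped to the workers must not capture the SparkContext:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    sc = spark.sparkContext

    rdd = sc.parallelize(range(4))

    # The lambda closes over `sc`. When Spark pickles the closure to send
    # it to the workers, it finds the SparkContext inside and raises the
    # SPARK-5063 exception, since a SparkContext exists only on the driver.
    rdd.map(lambda x: sc.parallelize([x]).count()).collect()

If Beam's pickling similarly captures the Databricks notebook's global sc/spark along with the pipeline functions, that might explain the error even though the tutorial never references Spark directly, but I am not sure.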
Thanks