
Every time I try to run the following Scala commands

val dataRDD = sc.textFile("hdfs://quickstart.cloudera:8020/user/cloudera/data/data.txt")
dataRDD.collect().foreach(println)
// or
dataRDD.count()

I get the following exception:

exitCodeException exitCode=1:   File "/etc/hadoop/conf.cloudera.yarn/topology.py", line 43
    print default_rack
                     ^
SyntaxError: Missing parentheses in call to 'print'

I am running Spark 1.6.0 on the Cloudera VM. Has anyone else faced this issue? What could be the reason? I understand that the error comes from the 'topology.py' file, which calls print without the parentheses that Python 3 requires. But why is this script being executed when I am not running Python/PySpark? This only happens on the Cloudera VM; when I run the same commands outside the VM with some other sample data, they work!

vks2106
gupi_bagha

1 Answer


I know it might be too late, but I am posting the answer anyway in case any other user faces the same issue.

This is a known issue, and the workaround is as follows:

Workaround: Add a YARN gateway role to each host that does not already have at least one YARN role (of any type). The YARN gateway needs to be added on the node/host where you are facing this issue.
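
As for why a Python script shows up in a Scala job at all: as far as I understand, the Hadoop client libraries that Spark uses resolve rack locations by running the topology script named in the `net.topology.script.file.name` property, so the script is executed even though no PySpark code is involved. The sketch below only illustrates the syntax problem from the traceback. The two source strings are hypothetical stand-ins for line 43 of topology.py, not the actual file contents, and the `compiles` helper is my own name.

```python
import sys

# Hypothetical stand-ins for the failing line in topology.py (line 43 in the traceback).
PY2_ONLY_LINE = "print default_rack"    # Python 2 print statement
PORTABLE_LINE = "print(default_rack)"   # parses under both Python 2 and Python 3


def compiles(source):
    """Return True if the running interpreter can parse `source`."""
    try:
        # compile() only parses the code; it never executes it,
        # so the undefined name default_rack does not matter here.
        compile(source, "<topology-sketch>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("interpreter: %d.%d" % sys.version_info[:2])
    print("print statement parses: %s" % compiles(PY2_ONLY_LINE))
    print("print() call parses:    %s" % compiles(PORTABLE_LINE))
```

If `python` on the host resolves to Python 3, the first form is rejected at parse time, which matches the SyntaxError in the question. The file under /etc/hadoop/conf.cloudera.yarn is a deployed client configuration, so I would treat this only as an explanation of the error and still use the gateway-role workaround above rather than hand-editing the script.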