
I am trying to use Python 2.7 with DataStax Spark. With the settings below in spark-env.sh, the Python worker fails to start and I get the error message shown further down. I don't know why. I would be very grateful for any suggestions. Thanks.

vi /etc/dse/spark/spark-env.sh
export PYTHONHOME=/usr/local
export PYTHONPATH=/usr/local/lib/python2.7
export PYSPARK_PYTHON=/usr/local/bin/python2.7

Error message:

Error from python worker:
  /usr/local/bin/python2.7: /usr/local/lib/python2.7/lib-dynload/_io.so: undefined symbol: _PyCodec_LookupTextEncoding
PYTHONPATH was:
  /usr/share/dse/spark/python/lib/pyspark.zip:/usr/share/dse/spark/python/lib/py4j-0.8.2.1-src.zip:/usr/share/dse/spark/lib/spark-core_2.10-1.4.2.2.jar:/usr/local/lib/python2.7
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
        at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:86)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:130)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:73)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:315)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:70)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
peter
  • What is your specific Python version? 2.7.10? – RussS Mar 01 '16 at 18:52
  • Also check out http://stackoverflow.com/questions/27383054/python-importerror-usr-local-lib-python2-7-lib-dynload-io-so-undefined-symb – RussS Mar 01 '16 at 18:53
  • My version is 2.7.11. I still have this problem and have no idea how to solve it. Any thoughts? – peter Mar 02 '16 at 01:15
  • The link I pasted above suggests you have multiple conflicting Python versions. If you are using OS X and brew, you could have a conflict between the system Python and the brew-installed one. – RussS Mar 02 '16 at 04:54
  • I am using CentOS and pip. – peter Mar 02 '16 at 06:12
  • I am now using 2.6, which is not great; 2.7 doesn't work for some reason. – peter Mar 02 '16 at 10:49

0 Answers