
Two questions:

  1. How do I run Python 3 with Spark? I run ./bin/pyspark and it automatically starts Python 2.7. How do I make it use Python 3?
  2. After I start pyspark, it prints a warning like this: 16/12/29 17:33:37 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable. Does it mean I downloaded the wrong Spark build for my platform?

I am using a MacBook Pro. Thanks.

Chenxi Zeng

2 Answers


To use Python 3 for a single session, set PYSPARK_PYTHON on the command line:

PYSPARK_PYTHON=python3 ./bin/pyspark

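To use it every time, open .bashrc in your home directory:

$ cd
$ vim .bashrc

Add these two lines at the end of the file and save it. Replace /usr/bin/python3 with the path that "which python3" prints on your machine. On macOS, Terminal starts bash as a login shell, which reads ~/.bash_profile rather than ~/.bashrc, so put the lines there if .bashrc is not picked up.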

export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=python3

After saving and exiting, source the file so the change takes effect in the current shell:

$ source .bashrc

Now when you start Spark, it will use Python 3.
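
To confirm the change took effect, a small check like the following can be pasted into the pyspark shell (or any Python prompt). It is only a sketch: interpreter_summary is a hypothetical helper name, not part of Spark, and it reports the driver-side interpreter only, not the workers.

```python
import os
import sys


def interpreter_summary():
    """Describe the running Python and the PYSPARK_* environment settings.

    Hypothetical helper for illustration; it is not a Spark API.
    """
    return {
        "version": (sys.version_info.major, sys.version_info.minor),
        "executable": sys.executable,
        # These are None when the variables are not set in this shell.
        "PYSPARK_PYTHON": os.environ.get("PYSPARK_PYTHON"),
        "PYSPARK_DRIVER_PYTHON": os.environ.get("PYSPARK_DRIVER_PYTHON"),
    }


if __name__ == "__main__":
    for key, value in interpreter_summary().items():
        print(key, "=", value)
```

If the version still shows a major version of 2, the variables were probably not exported in the shell that launched pyspark.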


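For your second question, note that the message is a warning rather than an error: as the text says, the built-in Java classes are used instead of the native library. It has to do with 32-bit vs 64-bit compilation of the native library. Read this: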

Hadoop "Unable to load native-hadoop library for your platform" warning

Mohammad Yusuf

Add this to your ~/.bashrc (the paths assume Hadoop is installed under /usr/local/hadoop):

export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/hadoop/lib/"

export HADOOP_COMMON_LIB_NATIVE_DIR="/usr/local/hadoop/lib/native/"

Or:

export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/hadoop/lib/native"
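
Before relying on these settings, it may help to check that the directory actually contains a native library. The sketch below assumes the library files have names starting with libhadoop (for example libhadoop.so on Linux); find_native_libs is a hypothetical helper, and the fallback path is simply the one used in this answer.

```python
import glob
import os


def find_native_libs(native_dir):
    """Return sorted paths in native_dir whose names start with 'libhadoop'.

    Assumption: the native library follows the libhadoop* naming pattern.
    """
    pattern = os.path.join(glob.escape(native_dir), "libhadoop*")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Fall back to the path from the answer when the variable is unset.
    native_dir = os.environ.get(
        "HADOOP_COMMON_LIB_NATIVE_DIR", "/usr/local/hadoop/lib/native/"
    )
    libs = find_native_libs(native_dir)
    print(native_dir, "->", libs if libs else "no libhadoop files found")
```

An empty result suggests the exports point at a directory without a native build, in which case the warning would be expected to remain.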

majdouline