I have Apache Spark installed on Ubuntu at /home/mymachine/spark-2.1.0-bin-hadoop2.7.
To use Spark I either have to go into the python directory located under that path, or I can use it from outside that directory with the help of a library called findspark. However, it seems I have to initialize this library like this:
import findspark
findspark.init("/home/mymachine/spark-2.1.0-bin-hadoop2.7")
every time I want to use findspark, which is not very convenient. Is there any way to initialize this library permanently?
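
For context, the top of every script or notebook where I use Spark currently looks roughly like this (the SparkSession part is only an illustration of what I run afterwards):

import findspark
findspark.init("/home/mymachine/spark-2.1.0-bin-hadoop2.7")  # must run before importing pyspark

import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()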
Here it was mentioned that I need to set a SPARK_HOME variable in .bash_profile. I did that, but no luck.
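
In case it matters, what I added to .bash_profile is roughly the following (exact lines from memory, so treat them as approximate):

export SPARK_HOME=/home/mymachine/spark-2.1.0-bin-hadoop2.7
export PATH=$SPARK_HOME/bin:$PATH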