I am new to Spark and am trying to use it on Windows. I was able to download and install Spark 1.4.1 successfully using the pre-built version with Hadoop. In the following directory:
/my/spark/directory/bin
I can run spark-shell and pyspark.cmd, and everything works fine. The only problem is that I want to import pyspark while coding in PyCharm. Right now I am using the following code to make things work:
import sys
import os
from operator import add

# Raw strings keep the backslash in the Windows path literal.
os.environ['SPARK_HOME'] = r"C:\spark-1.4.1-bin-hadoop2.6"
sys.path.append(r"C:\spark-1.4.1-bin-hadoop2.6/python")
sys.path.append(r"C:\spark-1.4.1-bin-hadoop2.6/python/build")

try:
    from pyspark import SparkContext
    from pyspark import SparkConf
except ImportError as e:
    print("Error importing Spark Modules", e)
    sys.exit(1)
I am wondering if there is an easier way to do this. I am using Windows 8, Python 3.4, and Spark 1.4.1.
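To make the question concrete, the setup above could at least be collected into one small helper, so each script only needs a single call. This is only a sketch, not a claim about the recommended approach: the name `add_spark_to_path` is hypothetical, and it assumes the py4j sources ship as a zip under `python/lib` in the Spark directory (the glob simply finds nothing if that assumption is wrong).

```python
import glob
import os
import sys


def add_spark_to_path(spark_home):
    """Hypothetical helper: expose a Spark install to this interpreter.

    Sets SPARK_HOME and appends Spark's Python sources to sys.path.
    Returns the list of path entries it considered.
    """
    os.environ["SPARK_HOME"] = spark_home
    python_dir = os.path.join(spark_home, "python")

    # Assumption: py4j is bundled as python/lib/py4j-*-src.zip.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))

    entries = [python_dir] + py4j_zips
    for entry in entries:
        # Avoid adding duplicates when the helper is called more than once.
        if entry not in sys.path:
            sys.path.append(entry)
    return entries
```

With this in place, a script would call `add_spark_to_path(r"C:\spark-1.4.1-bin-hadoop2.6")` once and then run `from pyspark import SparkContext, SparkConf` as before. It still hard-codes the install location in the script, which is the part I would like to avoid.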