I am trying to run PySpark on Windows with GraphFrames.
The GraphFrames QuickStart Guide mentions the following:
If you have GraphFrames available as a JAR graphframes.jar, you can make GraphFrames available by passing the JAR to the pyspark shell script as follows:
$ ./bin/pyspark --master local[4] --py-files graphframes.jar --jars graphframes.jar
Is there a similar command (like --py-files) to include the .jar distribution on Windows?
I tried using NotebookApp.file_to_run = "graphframes-0.2.0-spark1.5-s_2.10.jar", but that did not work. Is there some other way to run GraphFrames with PySpark on Windows? TIA.
What I run on the command line to start PySpark:
ipython notebook %SPARK_HOME%/bin/pyspark
The final command I tried in order to run GraphFrames:
ipython notebook %SPARK_HOME%/bin/pyspark NotebookApp.file_to_run=graphframes-0.2.0-spark1.5-s_2.10.jar
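One avenue I have not verified: as far as I understand, PySpark reads the PYSPARK_SUBMIT_ARGS environment variable when it launches the JVM, so the --jars and --py-files flags from the QuickStart command might be supplied that way instead of on the command line. Below is a minimal sketch under that assumption. The helper name build_submit_args is hypothetical, and the JAR is assumed to sit in the working directory at a path without spaces.

```python
import os


def build_submit_args(jar_path):
    """Build a PYSPARK_SUBMIT_ARGS value that passes one JAR via --jars and --py-files.

    Hypothetical helper for illustration; assumes jar_path contains no spaces.
    """
    # The trailing "pyspark-shell" token is expected when this variable is set by hand.
    return "--jars {0} --py-files {0} pyspark-shell".format(jar_path)


# Assumption: the JAR named in the question is in the current directory.
jar = "graphframes-0.2.0-spark1.5-s_2.10.jar"

# This must be set before the SparkContext (and hence the JVM) is created.
os.environ["PYSPARK_SUBMIT_ARGS"] = build_submit_args(jar)
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```

If this works, it would have to run in the first notebook cell, before any SparkContext exists; whether GraphFrames then imports cleanly on Windows is exactly what I am unsure about.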