I am trying to use the IPython notebook with Apache Spark 1.4.0. I followed the two tutorials below to set up my configuration:
Installing Ipython notebook with pyspark 1.4 on AWS
and
Configuring IPython notebook support for Pyspark
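(For context, both tutorials have you create a dedicated IPython profile first, roughly via ipython profile create pyspark; the two files below live inside that profile.)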
After finishing the configuration, here is the relevant code from the files involved:
1. ipython_notebook_config.py
c = get_config()
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8193
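If I understand the tutorials correctly, this file belongs in the profile directory, which can be checked with ipython locate profile pyspark (it should print something like ~/.ipython/profile_pyspark).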
2. 00-pyspark-setup.py
import os
import sys
spark_home = os.environ.get('SPARK_HOME', None)
sys.path.insert(0, spark_home + "/python")
# Add py4j to the path.
# You may need to change the version number to match your install.
sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.8.2.1-src.zip'))
# Initialize PySpark to predefine the SparkContext variable 'sc'
execfile(os.path.join(spark_home, 'python/pyspark/shell.py'))
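As an aside, here is a slightly more defensive variant of that startup file (just a sketch, and Python 2 only since it keeps execfile); it fails fast when SPARK_HOME is unset and globs for the bundled py4j zip instead of hard-coding the version:
import glob
import os
import sys
# Fail fast if SPARK_HOME is not exported, instead of crashing later
# with a confusing TypeError on the string concatenation.
spark_home = os.environ.get('SPARK_HOME')
if not spark_home:
    raise ValueError("SPARK_HOME is not set; check your .bash_profile")
sys.path.insert(0, os.path.join(spark_home, 'python'))
# Pick up whatever py4j version ships with this Spark build.
py4j_zips = glob.glob(os.path.join(spark_home, 'python', 'lib', 'py4j-*-src.zip'))
if not py4j_zips:
    raise ValueError("no py4j zip found under %s/python/lib" % spark_home)
sys.path.insert(0, py4j_zips[0])
# Initialize PySpark so the SparkContext variable 'sc' is predefined.
execfile(os.path.join(spark_home, 'python', 'pyspark', 'shell.py'))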
I also added the following to my .bash_profile and then sourced it:
export SPARK_HOME='/home/hadoop/spark'
source ~/.bash_profile
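To double-check the export, the following should succeed in a fresh shell (assuming Spark really is installed under /home/hadoop/spark):
echo $SPARK_HOME
ls $SPARK_HOME/python/pyspark/shell.py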
However, when I run
ipython notebook --profile=pyspark
it shows the message: unrecognized alias '--profile=pyspark' it will probably have no effect
It seems the notebook is not being configured with PySpark successfully. Does anyone know how to solve this? Thank you very much.
Here are the relevant software versions:
IPython/Jupyter: 4.0.0
Spark: 1.4.0
AWS EMR: 4.0.0
Python: 2.7.9
By the way, I have already read IPython notebook won't read the configuration file, but it didn't help.