
I am using Spark 1.6.0 in local mode. I have created an IPython PySpark profile so that the PySpark kernel starts in Jupyter Notebook. All of this works correctly.

I want to use the [spark-csv](https://github.com/databricks/spark-csv) package inside Jupyter Notebook. I tried to edit the file `~/.ipython/profile_pyspark/startup/00-pyspark-setup.py` and add `--packages com.databricks:spark-csv_2.11:1.4.0` to the `pyspark-shell` command.
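Roughly, the relevant part of that startup file sets `PYSPARK_SUBMIT_ARGS` like this (a minimal sketch of what I mean; my exact file may differ):

    import os

    # Everything before "pyspark-shell" is forwarded to spark-submit
    # as options, so --packages must come before that final token.
    os.environ["PYSPARK_SUBMIT_ARGS"] = (
        "--packages com.databricks:spark-csv_2.11:1.4.0 pyspark-shell"
    )

This did not help; I still get this error message: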

    Py4JJavaError: An error occurred while calling o22.load.
    : java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.csv. Please find packages at http://spark-packages.org
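
The call that fails is a plain spark-csv read, something like this (a sketch; the file path is just a placeholder):

    # Standard spark-csv read on Spark 1.6; this raises the
    # ClassNotFoundException above when the package is not on the classpath.
    df = (sqlContext.read
          .format("com.databricks.spark.csv")
          .option("header", "true")
          .load("some_file.csv"))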
I have also tried [this solution][2] and many others; none of them worked.

Do you have any suggestions?

  • An [answer here](http://stackoverflow.com/questions/33908156/how-to-load-jar-dependenices-in-ipython-notebook) does not solve my problem. That's the reason why I opened this one. – Matus Cimerman Mar 18 '16 at 07:09
  • `export SPARK_OPTS='--packages com.databricks:spark-csv_2.10:1.4.0'` – Emre Jun 30 '16 at 20:37
