I get the error `IndexError: list index out of range` in my Jupyter notebook when I run:

import findspark
findspark.init()

I know an answer already exists in this link (so this is NOT a duplicate). However, when I ran `which spark-shell`, the output was `/opt/anaconda3/bin/spark-shell`, so I passed that path to `init`:

import findspark
findspark.init('/opt/anaconda3/bin/spark-shell')

And I get the same error. How can I fix it? Thank you.

Chique_Code
  • try `findspark.init('/opt/anaconda3/bin/')` – Gaurang Shah Feb 12 '20 at 03:26
  • @GaurangShah Same error :/ – Chique_Code Feb 21 '20 at 16:43
  • @Chique_Code: Try using `whereis spark` or `locate spark` and see if they give the same location as `which spark-shell`. Also how did you install spark? Usually spark wouldn't be located in the anaconda3 directory, you can check if there is anything like `/opt/spark...`. – Shaido Feb 25 '20 at 06:25

1 Answer

It wants the Spark home directory, not an executable. You should probably have an environment variable $SPARK_HOME set. Do you have it?

If not, try to run one of these:

ls -la /opt/anaconda3/bin/spark-shell
readlink -f /opt/anaconda3/bin/spark-shell

This will show you where the actual bin folder of the Spark home is. Pass the part of that path that comes before bin (the Spark home directory itself) to init.

I think that if you set SPARK_HOME, findspark.init() runs without needing the path specified.
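The same lookup can be sketched in Python. This is only a rough illustration: `spark_home_from_executable` is a hypothetical helper name, and it assumes the resolved executable sits in a `<spark home>/bin/` layout, which may not hold for every install.

```python
import os


def spark_home_from_executable(executable_path):
    """Guess the Spark home from the path of an executable such as spark-shell.

    Hypothetical helper: assumes the layout <spark home>/bin/<executable>.
    """
    # Follow symlinks, roughly what `readlink -f` does in the shell.
    resolved = os.path.realpath(executable_path)
    bin_dir = os.path.dirname(resolved)
    # The Spark home is assumed to be the parent of the bin folder.
    return os.path.dirname(bin_dir)


if __name__ == "__main__":
    # Path taken from the question; adjust it to your own `which spark-shell` output.
    spark_home = spark_home_from_executable("/opt/anaconda3/bin/spark-shell")
    print(spark_home)
    # Then, in the notebook, either pass the directory explicitly:
    #     import findspark
    #     findspark.init(spark_home)
    # or export it and call init() without arguments:
    #     os.environ["SPARK_HOME"] = spark_home
    #     findspark.init()
```

If the printed directory does not look like a Spark installation, the assumption about the layout is probably wrong for your setup, and it is worth checking how Spark was installed.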