I am running PySpark locally and ran into problems that seem related to the path to Python: running python3 in the command prompt gives an error, while running python does not (I have Python 3 installed). When I tried to run a PySpark job, I got a java.io.IOException.
Now I have added
import os
import sys
os.environ['PYSPARK_PYTHON'] = sys.executable
os.environ['PYSPARK_DRIVER_PYTHON'] = sys.executable
which solves my problem. However, this does not seem like the best solution. Do I have to add this at the beginning of every file, or is there a smarter solution?
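To make the question concrete, here is a minimal sketch of the kind of thing I would like to avoid repeating. It only wraps the two assignments above in a function that could live in one shared module. The module name spark_env.py and the function name use_current_interpreter are hypothetical, made up for illustration, and I am not sure this is the recommended approach.

```python
# spark_env.py (hypothetical module name, for illustration only)
import os
import sys


def use_current_interpreter(env=None):
    """Point PySpark's driver and workers at the interpreter running this script.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    # Same two assignments as in the workaround, just kept in one place.
    env["PYSPARK_PYTHON"] = sys.executable
    env["PYSPARK_DRIVER_PYTHON"] = sys.executable
    return env
```

Each script would then import this and call use_current_interpreter() before creating the Spark session, which is still one extra line per file.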