
I am trying to run pyspark with Jupyter Lab options (inline) as follows.

PYSPARK_DRIVER_PYTHON="jupyter" PYSPARK_DRIVER_PYTHON_OPTS="notebook --NotebookApp.notebook_dir='/' --NotebookApp.port=4444" $SPARK_HOME/bin/pyspark

This approach is inspired by the official documentation. However, when the command is executed, the notebook directory is still served from /root and the port is still 8888.

I also tried wrapping the execution in a .sh (shell) script, as follows.

#!/bin/bash

export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --NotebookApp.notebook_dir='/' --NotebookApp.port=4444"

pyspark "$@"
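I saved the script as run_pyspark.sh (the file name is just for illustration) and ran it like this.

chmod +x run_pyspark.sh
./run_pyspark.sh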

But this attempt to override the notebook directory and port does not work either. I have checked to make sure that /root/.jupyter/jupyter_notebook_config.py does not exist.

Any ideas on what is wrong here?

Jane Wayne

1 Answer


Never mind. It turns out that, following this post, I had hard-coded the environment variables in spark-env.sh. After removing those variables, everything works.
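For anyone hitting the same thing: the offending entries in $SPARK_HOME/conf/spark-env.sh looked something like the lines below (the values here are illustrative, not my exact file). Because spark-env.sh is sourced every time pyspark starts, these assignments silently won over whatever I exported on the command line; deleting or commenting them out fixed it.

export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"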

Jane Wayne