
I'm looking for a way to easily execute parameterized runs of Jupyter Notebooks, and I've found the Papermill project (https://github.com/nteract/papermill/).

This tool seems to match my requirements, but I can't find any reference to PySpark kernel support.

Are PySpark kernels supported by Papermill executions?

If they are, is there some configuration to be done to connect them to the Spark cluster used by Jupyter?

Thanks in advance for the support, Mattia


1 Answer


Papermill will work with PySpark kernels, so long as they implement Jupyter's kernel spec.

Configuring your kernel will depend on the kernel in question. Usually these kernels read from spark.conf and/or spark.properties files to configure cluster and launch-time settings for Spark.
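As a sketch of what this looks like in practice: Papermill lets you select the kernel explicitly with `-k`/`--kernel`, so you can target a PySpark kernel the same way as any other. The kernel name `pyspark` and the parameter names below are assumptions for illustration; check your own registered kernels first.

```shell
# List the kernel specs registered with Jupyter to find the
# exact name of your PySpark kernel (assumed "pyspark" below)
jupyter kernelspec list

# Run the notebook under that kernel, injecting parameters;
# "input_path" is a hypothetical parameter defined in the
# notebook's "parameters"-tagged cell
papermill input.ipynb output.ipynb \
    -k pyspark \
    -p input_path /data/events.parquet
```

The same can be done from Python with `papermill.execute_notebook(..., kernel_name="pyspark")`. Any Spark-specific settings (master URL, executor memory, etc.) still come from the kernel's own configuration, not from Papermill.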
