
I'm looking for a way to easily execute parameterized runs of Jupyter Notebooks, and I've found the Papermill project (https://github.com/nteract/papermill/).

This tool seems to match my requirements, but I can't find any reference to PySpark kernel support.

Are PySpark kernels supported by Papermill executions?

If they are, is there some configuration to be done to connect them to the Spark cluster used by Jupyter?

Thanks in advance for the support, Mattia


1 Answer


Papermill will work with PySpark kernels, so long as they implement Jupyter's kernel spec.

Configuring your kernel will depend on the kernel in question. Usually these kernels read from spark.conf and/or spark.properties files to configure cluster and launch-time settings for Spark.
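As a sketch of what this looks like in practice: Papermill lets you select the kernel explicitly with `-k`/`--kernel`, so you can target a PySpark kernel the same way as any other. The kernel name `pyspark` and the parameter names below are assumptions for illustration; check your own registered kernels first.

```shell
# List the kernel specs registered with Jupyter to find the
# exact name of your PySpark kernel (assumed "pyspark" below)
jupyter kernelspec list

# Run the notebook under that kernel, injecting parameters;
# "input_path" is a hypothetical parameter defined in the
# notebook's "parameters"-tagged cell
papermill input.ipynb output.ipynb \
    -k pyspark \
    -p input_path /data/events.parquet
```

The same can be done from Python with `papermill.execute_notebook(..., kernel_name="pyspark")`. Any Spark-specific settings (master URL, executor memory, etc.) still come from the kernel's own configuration, not from Papermill.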
