I have a Databricks notebook (Spark / Python) that reads from S3, does some ETL work, and writes the results back to S3. I now want to run this code on a schedule as a .py script rather than from a notebook. The reason I'd prefer a plain Python script is that it makes version control easier.
I understand I need to create a Databricks job that runs on a schedule, but it looks like a Databricks job can only run a JAR (Scala) or a notebook. I don't see a way to run a Python script there.
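For context, what I was hoping to find is something along these lines: a scheduled job definition that points directly at a .py file, roughly like the sketch below. This is just my guess at what such a job spec might look like; the field names (e.g. `spark_python_task`), cluster settings, and file path are all hypothetical, not something I've found in the docs.

```json
{
  "name": "nightly-etl",
  "new_cluster": {
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2
  },
  "spark_python_task": {
    "python_file": "dbfs:/scripts/etl_job.py"
  },
  "schedule": {
    "quartz_cron_expression": "0 0 2 * * ?",
    "timezone_id": "UTC"
  }
}
```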
Am I missing something?