
I'm creating a new job in Databricks using the databricks-cli:

databricks jobs create --json-file ./deploy/databricks/config/job.config.json

With the following json:

{
    "name": "Job Name",
    "new_cluster": {
        "spark_version": "4.1.x-scala2.11",
        "node_type_id": "Standard_D3_v2",
        "num_workers": 3,
        "spark_env_vars": {
            "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
        }
    },
    "libraries": [
        {
            "maven": {
                "coordinates": "com.microsoft.sqlserver:mssql-jdbc:6.5.3.jre8-preview"
            }
        }
    ],
    "timeout_seconds": 3600,
    "max_retries": 3,
    "schedule": {
        "quartz_cron_expression": "0 0 22 ? * *",
        "timezone_id": "Israel"
    },
    "notebook_task": {
        "notebook_path": "/notebooks/python_notebook"
    }
}

And I want to add parameters that will be accessible in the notebook via:

dbutils.widgets.text("argument1", "<default value>")
dbutils.widgets.get("argument1")
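
For context, dbutils.widgets only exists inside a Databricks notebook, so the lookup behavior can't be run locally. The sketch below is a hypothetical local stand-in that mimics the relevant rule: a value supplied by the job (base_parameters) overrides the default declared with widgets.text():

```python
# Hypothetical local stand-in for dbutils.widgets (the real API exists
# only inside a Databricks notebook). It models one rule: a value the
# job passes via base_parameters overrides the widget's default.
class Widgets:
    def __init__(self, base_parameters=None):
        self._overrides = dict(base_parameters or {})  # from the job config
        self._values = {}

    def text(self, name, default_value):
        # Job-supplied value wins; otherwise fall back to the default.
        self._values[name] = self._overrides.get(name, default_value)

    def get(self, name):
        return self._values[name]


# Simulate a job run that passed argument1 but not argument2.
widgets = Widgets(base_parameters={"argument1": "value 1"})
widgets.text("argument1", "<default value>")
widgets.text("argument2", "<default value>")
print(widgets.get("argument1"))  # value 1 (overridden by the job)
print(widgets.get("argument2"))  # <default value>
```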
asked by Mor Shemesh (edited by Alex Ott)

1 Answer


Found the answer after a bit of tweaking: you can simply expand the notebook_task property to include base_parameters, as follows:

{
    "notebook_task": {
        "notebook_path": "/social/04_batch_trends",
        "base_parameters": {           
            "argument1": "value 1",
            "argument2": "value 2"
        }
    }
}

This is documented in the Create method of the Jobs API. It lists the notebook_task parameter, which is of type NotebookTask.
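
If you build the config file programmatically rather than by hand, a minimal sketch (job name, path, and parameter values are illustrative) that merges base_parameters into an existing job config before handing it to databricks jobs create --json-file:

```python
import json

# Sketch: start from a minimal job config and attach base_parameters
# to the notebook_task, then serialize it for the databricks CLI.
job_config = {
    "name": "Job Name",
    "notebook_task": {"notebook_path": "/notebooks/python_notebook"},
}

job_config["notebook_task"]["base_parameters"] = {
    "argument1": "value 1",
    "argument2": "value 2",
}

# Write this string to e.g. ./deploy/databricks/config/job.config.json
payload = json.dumps(job_config, indent=4)
print(payload)
```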

answered by Mor Shemesh (edited by Marco Roy)

    [Here](https://docs.databricks.com/dev-tools/api/latest/jobs.html#jobsnotebooktask) is the documentation for the NotebookTask data structure, as documented in the [Create method](https://docs.databricks.com/dev-tools/api/latest/jobs.html#create) of the Jobs API. – Marco Roy Jan 21 '21 at 17:38