7

I am using a Databricks Spark cluster and want to add a customized Spark configuration.
There is Databricks documentation on this, but I am not getting any clue about how and what changes I should make. Can someone please share an example of how to configure the Databricks cluster?
Is there any way to see the default Spark configuration for a Databricks cluster?

Stark
  • I have yet to see any documentation of the Databricks-specific config options. Hopefully someone can chime in with that documentation. – Foxhound013 Jan 25 '23 at 15:16

2 Answers

2
  1. You can set the cluster config in the Compute section of your Databricks workspace. Go to Compute (and select your cluster) > Configuration > Advanced options > Spark. (Screenshot: cluster config under Advanced options)

  2. Or, you can set configs via a notebook:

    %python
    spark.conf.set("spark.sql.<name-of-property>", <value>)
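To the second part of the question (seeing what the cluster is currently configured with), a minimal notebook sketch like the one below should work; the property name spark.sql.shuffle.partitions is just an illustrative example, and spark / spark.sparkContext are the objects Databricks predefines in a notebook session:

    %python
    # Set a session-level Spark SQL property (illustrative property name)
    spark.conf.set("spark.sql.shuffle.partitions", "64")

    # Read a single property back to confirm it took effect
    print(spark.conf.get("spark.sql.shuffle.partitions"))

    # List all (key, value) pairs the cluster was started with --
    # one way to inspect the cluster's default Spark configuration
    for key, value in sorted(spark.sparkContext.getConf().getAll()):
        print(key, "=", value)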

Joey Gomes
0

There are several ways to set the cluster's Spark configs:

  1. Manually in the Compute tab (as mentioned before): go to Compute > select a cluster > Advanced Options > Spark. (Screenshot: Spark config field under Advanced Options)

  2. Via a notebook (as mentioned before): in a cell of your Databricks notebook, you can set a Spark configuration for that session/job by running the spark.conf.set command, e.g. spark.conf.set("spark.executor.memory", "4g"). Note that cluster-level properties such as executor memory only take effect if they are set before the cluster starts, so those belong in the cluster config rather than the session.

  3. Using the Jobs API / CLI: if you are aiming to deploy jobs programmatically in a multi-environment fashion (e.g. dev, staging, production), you can pass the Spark configs in the job's cluster spec; see the sketch after this list. (Screenshot: Databricks Jobs API example)
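As a rough sketch of option 3, the request below assumes the Jobs API 2.1 jobs/create endpoint; the workspace URL, token, notebook path, Spark version, node type, and the specific Spark properties are all illustrative placeholders to adapt to your environment:

    import requests

    # Illustrative values -- replace with your workspace URL and token
    DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
    TOKEN = "<personal-access-token>"

    job_spec = {
        "name": "example-job-with-spark-conf",
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": "/Repos/project/main_notebook"},
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",
                    "node_type_id": "i3.xlarge",
                    "num_workers": 2,
                    # Custom Spark configuration applied when the job cluster starts
                    "spark_conf": {
                        "spark.sql.shuffle.partitions": "64",
                        "spark.executor.memory": "4g",
                    },
                },
            }
        ],
    }

    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.1/jobs/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=job_spec,
    )
    resp.raise_for_status()
    print(resp.json())  # response contains the new job_id

Keeping one such spark_conf block per environment (dev/staging/prod) in whatever tooling generates these payloads is what makes this approach convenient for multi-environment deployments.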

Useful links!