
Using databricks-connect, I am able to run Spark code on a cluster. The official documentation (https://learn.microsoft.com/en-us/azure/databricks/dev-tools/databricks-connect) also only mentions Spark code. If I execute 'normal' Python code, it does not run on Databricks but in my local environment.
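For example, here is a quick sketch of the behavior I am seeing (assuming a working databricks-connect setup):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Spark operations are shipped to the Databricks cluster:
df = spark.range(10)
print(df.count())  # computed remotely, result returned here

# ...but plain Python runs in my local interpreter:
import platform
print(platform.node())  # prints my local machine's hostname, not the driver's
```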

When working in Databricks notebooks in the browser, I can also run 'normal' Python code, which, as far as I know, is executed on the driver node.

Is there a way to connect an external IDE (e.g. PyCharm) to Databricks such that all code is executed on the cluster, as if I were working within a Databricks notebook?

Edit: To make it clearer: I know how to connect PyCharm to Databricks using databricks-connect, and I can run PySpark code that way. What I want is to run non-Spark code (e.g. train a sklearn model on some data after converting the Spark DataFrame to a pandas DataFrame) on Databricks. To my understanding, with databricks-connect all non-Spark code runs on my local machine. Within Databricks notebooks, however, it runs on the driver, and I am looking for a way to achieve this using databricks-connect. A minimal sketch of the workflow follows.
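(In this sketch, "my_table" and the column names are placeholders for my actual data:)

```python
from pyspark.sql import SparkSession
from sklearn.linear_model import LogisticRegression

spark = SparkSession.builder.getOrCreate()

# This runs on the Databricks cluster via databricks-connect:
sdf = spark.table("my_table")  # placeholder table name

# toPandas() collects the data to my machine, so everything
# below executes locally rather than on the driver:
pdf = sdf.toPandas()
model = LogisticRegression().fit(pdf[["feature_1", "feature_2"]], pdf["label"])
```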

prozaxx

0 Answers