Questions tagged [databricks-connect]
172 questions
13 votes
2 answers
How to properly access dbutils in Scala when using Databricks Connect
I'm using Databricks Connect to run code in my Azure Databricks cluster locally from IntelliJ IDEA (Scala).
Everything works fine. I can connect, debug, inspect locally in the IDE.
I created a Databricks Job to run my custom app JAR, but it fails…

empz
11 votes
1 answer
Switching between Databricks Connect and local Spark environment
I am looking to use Databricks Connect to develop a PySpark pipeline. Databricks Connect is great because I can run my code on the cluster where the actual data resides, which makes it perfect for integration testing, but I also want to be able…

casparjespersen
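One way to approach the switch described in this question is to select the execution mode from an environment variable. The sketch below is not from the question: the variable name `SPARK_MODE` and both function names are hypothetical, and it assumes each mode runs in its own virtual environment, since the legacy `databricks-connect` package and a plain `pyspark` install conflict when installed side by side.

```python
import os

VALID_MODES = ("local", "databricks")


def resolve_mode(env=None):
    """Read the (hypothetical) SPARK_MODE variable; default to local runs."""
    env = os.environ if env is None else env
    mode = env.get("SPARK_MODE", "local").strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(f"SPARK_MODE must be one of {VALID_MODES}, got {mode!r}")
    return mode


def build_session(mode):
    """Create a SparkSession for the chosen mode."""
    # Imported lazily so resolve_mode() can be unit-tested without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("pipeline")
    if mode == "local":
        # Plain PySpark environment: run everything in-process.
        builder = builder.master("local[*]")
    # In "databricks" mode the session is assumed to pick up the cluster
    # settings written by `databricks-connect configure`.
    return builder.getOrCreate()
```

Keeping the mode decision in a small pure function means the pipeline code itself only ever receives a session and does not need to know where it runs.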
9 votes
2 answers
Unable to make private java.nio.DirectByteBuffer(long,int) accessible
I'm using Python to access Databricks through databricks-connect. Under the hood, this uses Spark, which is Java-based, so I need Java. I downloaded the JDK (version 14) and set it as the JAVA_HOME environment variable, but when I run the…

anthino12
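The legacy Databricks Connect client documents Java 8 as its supported runtime, so checking the local Java major version can help narrow down errors like the one in this title. The sketch below is an illustration and not part of the question; the helper name `java_major_version` is hypothetical, and it only parses a version string such as the one reported by `java -version`.

```python
import re


def java_major_version(version_string):
    """Return the major Java version from strings like '1.8.0_292' or '14.0.2'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version_string.strip())
    if match is None:
        raise ValueError(f"unrecognised Java version: {version_string!r}")
    first = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_292).
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first
```

For example, `java_major_version("14.0.2")` evaluates to 14, which would not match a Java 8 requirement.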
8 votes
1 answer
Running into 'java.lang.OutOfMemoryError: Java heap space' when using toPandas() and databricks connect
I'm trying to convert a PySpark DataFrame of size [2734984 rows x 11 columns] to a pandas DataFrame by calling toPandas(). While this works fine (11 seconds) in an Azure Databricks notebook, I run into a…

petzholt
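For a sense of scale, the raw size of the DataFrame in this question can be estimated from its shape. The 8 bytes per value used below is an assumption (the question does not list the column types), and the function name is hypothetical; string columns and the intermediate copies made during conversion would push the real footprint higher.

```python
def estimate_bytes(rows, columns, bytes_per_value=8):
    """Rough estimate: every cell is a fixed-width value of the given size."""
    return rows * columns * bytes_per_value


size = estimate_bytes(2_734_984, 11)
print(f"{size / 1024 ** 2:.1f} MiB")  # prints "229.5 MiB"
```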
5 votes
0 answers
databricks-connect cannot load module in udf
I'm trying to load PyNaCl into a pyspark UDF running on Windows.
from nacl import bindings as c
def verify_signature(msg, keys):
    c.crypto_sign_ed25519ph_update(...)
    ...
verify_signature_udf = udf(lambda x: verify_signature(x, public_keys),…

HeyMan
5 votes
1 answer
Execute databricks magic command from PyCharm IDE
With databricks-connect we can successfully run code written in Databricks or a Databricks notebook from many IDEs. Databricks has also created many magic commands to support features such as running multiple languages in each cell by…

Rohit Mishra
4 votes
0 answers
Issues running PySpark UDF with Databricks Connect
I'm having problems running my PySpark UDFs in a distributed way, e.g. via Databricks Connect.
For example:
import pyspark.sql.functions as f
class MyClass(object):
    def __init__(self, number_string):
        self.number = int(number_string)
…

Kasia Kulma
4 votes
1 answer
Databricks Connect java.lang.ClassNotFoundException
I updated our Databricks cluster to DBR 9.1 LTS on Azure Databricks, but a package I use regularly now gives me an error when I run it in VS Code with databricks-connect, which it did not do with the previous cluster. The previous cluster was…

Louis
4 votes
0 answers
Databricks: Remote execution of non-spark code
Using databricks-connect, I am able to run Spark code on a cluster. The official documentation (https://learn.microsoft.com/en-us/azure/databricks/dev-tools/databricks-connect) also mentions only Spark code.
If I execute 'normal' Python code, it…

prozaxx
4 votes
1 answer
Databricks connect to IntelliJ + python Error Exception in thread "main" java.lang.NoSuchMethodError:
I am trying to connect Databricks with my IDE.
I do not have Spark and/or Scala downloaded on my machine, but I did download PySpark (pip install pyspark).
I constructed the necessary environment variables and made a folder Hadoop, in which I placed…

Rens
4 votes
1 answer
How to download an installed dbfs jar file from databricks cluster to local machine?
I am new to Databricks and I wish to download an installed library from a Databricks cluster to my local machine. Could you please help me with that?
To elaborate, I already have a running cluster on which the libraries are already installed. I need to…

jukebox
4 votes
1 answer
Error Connecting to Databricks from local machine
I am attempting to connect to Databricks from my Mac (Mojave).
I ran pip install -U databricks-connect==5.5.*
I start a spark-shell, but when I try to run a query in Spark I get the following error:
Caused by:…

DataTx
4 votes
1 answer
Databricks connect fails with No FileSystem for scheme: abfss
I have set up Databricks Connect so that I can develop locally and get IntelliJ goodies while at the same time leveraging the power of a big Spark cluster on Azure Databricks.
When I want to read or write to Azure Data…

zaxme
4 votes
3 answers
How can I connect Databricks Community Edition cluster from PyCharm
I want to work on some small exercise projects and would like to use a Databricks cluster. Can this be done? I am hoping there is some way to connect to a Databricks cluster through the databricks-connect utility. I just need some steps. Thanks in advance.

Manish
3 votes
2 answers
Can not connect dbt cloud or dbt core to databricks
I am having issues connecting dbt Cloud and dbt Core to Databricks.
I have read these 4 links, but still cannot…

Julian Eccleshall