Questions tagged [azure-databricks]

For questions about the usage of the Databricks Lakehouse Platform on Microsoft Azure.

Overview

Azure Databricks is the Azure-based implementation of Databricks, a high-level platform for working with Apache Spark that includes Jupyter-style notebooks.

Azure Databricks is a first-class Azure service that natively integrates with other Azure services such as Active Directory, Blob Storage, Cosmos DB, Data Lake Store, Event Hubs, HDInsight, Key Vault, and Synapse Analytics.


4095 questions
44 votes · 5 answers

How to get the schema definition from a dataframe in PySpark?

In PySpark you can define a schema and read data sources with this pre-defined schema, e.g.: Schema = StructType([StructField("temperature", DoubleType(), True), StructField("temperature_unit", StringType(), True), …
Hauke Mallow
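
A minimal sketch of how the schema can be inspected and reused, assuming a notebook where spark is the usual SparkSession:

    import json
    from pyspark.sql.types import StructType, StructField, DoubleType, StringType

    schema = StructType([
        StructField("temperature", DoubleType(), True),
        StructField("temperature_unit", StringType(), True),
    ])
    df = spark.createDataFrame([(21.5, "C")], schema=schema)

    df.printSchema()                # human-readable tree
    print(df.schema)                # the StructType object itself
    schema_json = df.schema.json()  # JSON string, easy to store

    # Rebuild the schema from its JSON form and reuse it for another read.
    restored = StructType.fromJson(json.loads(schema_json))
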
28 votes · 3 answers

How to list all the mount points in Azure Databricks?

I tried %fs ls dbfs:/mnt, but I want to know: does this give me all the mount points?
Shahid Ahmed
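
The short answer is dbutils.fs.mounts(); a minimal sketch, assuming a Databricks notebook where dbutils is predefined:

    # Each entry exposes the mount point and the backing storage URI.
    for mount in dbutils.fs.mounts():
        print(mount.mountPoint, "->", mount.source)
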
27 votes · 6 answers

How to delete all files from a folder with Databricks dbutils

Can someone let me know how to use the Databricks dbutils to delete all files from a folder? I have tried the following, but unfortunately Databricks doesn't support…
Carltonp
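
A sketch of the usual approach with dbutils.fs.rm; the folder path is hypothetical:

    # Remove the folder and everything under it in one call.
    dbutils.fs.rm("dbfs:/mnt/mydata/tmp", recurse=True)

    # To empty the folder but keep it, delete its children instead.
    for f in dbutils.fs.ls("dbfs:/mnt/mydata/tmp"):
        dbutils.fs.rm(f.path, recurse=True)
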
25 votes · 3 answers

Parquet vs Delta format in Azure Data Lake Gen 2 store

I am importing fact and dimension tables from SQL Server to Azure Data Lake Gen 2. Should I save the data as "Parquet" or "Delta" if I am going to wrangle the tables to create a dataset useful for running ML models on Azure Databricks? What is the…
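
For context, Delta is Parquet plus a transaction log, which is what enables ACID writes, MERGE, and time travel. A hedged sketch writing the same DataFrame (df, standing in for the question's tables) both ways; the paths are illustrative:

    base = "abfss://container@account.dfs.core.windows.net/curated"  # hypothetical
    df.write.mode("overwrite").parquet(base + "/sales_parquet")
    df.write.format("delta").mode("overwrite").save(base + "/sales_delta")

    # Time travel only works on the Delta copy.
    old = spark.read.format("delta").option("versionAsOf", 0).load(base + "/sales_delta")
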
25 votes · 5 answers

Databricks: How do I get path of current notebook?

Databricks is smart and all, but how do you identify the path of your current notebook? The guide on the website does not help. It suggests: %scala dbutils.notebook.getContext.notebookPath res1: Option[String] =…
Esben Eickhardt
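
In Python the same context is reachable through dbutils' entry point; a sketch that relies on this internal (and therefore change-prone) API:

    path = (dbutils.notebook.entry_point.getDbutils()
            .notebook().getContext().notebookPath().get())
    print(path)  # e.g. /Users/someone@example.com/my-notebook
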
24 votes · 7 answers

Azure Databricks - Can not create the managed table The associated location already exists

I have the following problem in Azure Databricks. Sometimes when I try to save a DataFrame as a managed table: SomeData_df.write.mode('overwrite').saveAsTable("SomeData") I get the following error: "Can not create the managed table('SomeData').…
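
A commonly suggested workaround is to drop the table and clear its leftover directory before overwriting; the warehouse path below is the default managed-table location and is an assumption:

    table = "SomeData"
    spark.sql(f"DROP TABLE IF EXISTS {table}")
    # Assumed default location for managed tables; adjust if your
    # metastore is configured differently.
    dbutils.fs.rm(f"dbfs:/user/hive/warehouse/{table.lower()}", recurse=True)
    SomeData_df.write.mode("overwrite").saveAsTable(table)
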
18 votes · 1 answer

Local instance of Databricks for development

I am currently working on a small team that is developing a Databricks based solution. For now we are small enough to work off of cloud instances of Databricks. As the group grows this will not really be practical. Is there a "local" install of…
John
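
Databricks itself does not ship as a local install, but a local Spark session with the open-source delta-spark package approximates the runtime for development; a sketch assuming pip install pyspark delta-spark:

    from delta import configure_spark_with_delta_pip
    from pyspark.sql import SparkSession

    builder = (SparkSession.builder.appName("local-dev")
               .config("spark.sql.extensions",
                       "io.delta.sql.DeltaSparkSessionExtension")
               .config("spark.sql.catalog.spark_catalog",
                       "org.apache.spark.sql.delta.catalog.DeltaCatalog"))
    spark = configure_spark_with_delta_pip(builder).getOrCreate()
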
16 votes · 2 answers

Stop Execution of Databricks notebook after specific cell

I tried sys.exit(0) (Python code) and dbutils.notebook.exit() in a Databricks notebook, but neither option worked. Please suggest another way to stop the execution of code after a specific cell in a Databricks notebook.
sizo_abe
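
Two approaches that usually come up, sketched here with a hypothetical condition; dbutils.notebook.exit ends the run cleanly, while raising an exception halts "Run All" at that cell:

    should_stop = True  # hypothetical flag computed earlier in the notebook
    if should_stop:
        dbutils.notebook.exit("stopped after validation cell")

    # Blunter alternative: any uncaught exception stops execution here.
    raise Exception("stopping notebook execution")
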
15 votes · 3 answers

Checking the version of Databricks Runtime in Azure

Is it possible to check the version of Databricks Runtime in Azure?
Krzysztof Słowiński
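
Two ways to read the runtime version from inside a notebook:

    import os
    # Environment variable set on Databricks cluster nodes.
    print(os.environ.get("DATABRICKS_RUNTIME_VERSION"))

    # Cluster tag carried in the Spark configuration.
    print(spark.conf.get("spark.databricks.clusterUsageTags.sparkVersion"))
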
14 votes · 3 answers

df to table throws TypeError: __init__() got multiple values for argument 'schema'

I have a pandas DataFrame, purchase_df. I want to convert it to a SQL table so I can run SQL queries on it from pandas. I tried purchase_df.to_sql('purchase_df', con=engine, if_exists='replace', index=False) and it throws TypeError:…
Arpan Ghimire
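
This particular TypeError is commonly reported when an older pandas runs against SQLAlchemy 2.x; a sketch under that assumption, with pinning SQLAlchemy below 2.0 (or upgrading pandas to 2.x) as the usual fix:

    # %pip install "sqlalchemy<2.0"   # typical remedy on a cluster
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("sqlite:///:memory:")  # illustrative connection
    purchase_df = pd.DataFrame({"item": ["a"], "price": [1.0]})
    purchase_df.to_sql("purchase_df", con=engine, if_exists="replace", index=False)
    print(pd.read_sql("SELECT * FROM purchase_df", con=engine))
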
14 votes · 1 answer

Printing secret value in Databricks

Even though secrets are for masking confidential information, I need to see the value of the secret to use it outside Databricks. When I simply print the secret it shows [REDACTED]. print(dbutils.secrets.get(scope="myScope",…
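
The redaction matches the exact secret string in cell output, so the widely shared workaround is printing the characters individually; the key name below is hypothetical, and note the value does end up visible in the notebook:

    secret = dbutils.secrets.get(scope="myScope", key="myKey")  # key name assumed
    print(" ".join(secret))  # e.g. "s 3 c r 3 t" instead of [REDACTED]
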
14 votes · 4 answers

List the files of a directory and its subdirectories recursively in Databricks (DBFS)

Using Python/dbutils, how do I display the files of the current directory and its subdirectories recursively in the Databricks File System (DBFS)?
Kiran A
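
dbutils.fs.ls is not recursive, so the usual answer is a small walker; a sketch with a hypothetical root path:

    def deep_ls(path):
        """Yield every file path under path, descending into folders."""
        for entry in dbutils.fs.ls(path):
            if entry.isDir():
                yield from deep_ls(entry.path)
            else:
                yield entry.path

    for p in deep_ls("dbfs:/mnt/mydata"):  # hypothetical root
        print(p)
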
14 votes · 2 answers

Writing logs with the Python logging module from Databricks to Azure Data Lake not working

I'm trying to write my own log files to Azure Data Lake Gen 2 from a Python notebook in Databricks, using the Python logging module. Unfortunately, I can't get it working. No errors are raised, the folders are created…
Dominik Braun
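
The logging module writes through local file handles, which do not work against abfss:// paths; a common workaround, with illustrative paths, is logging to the driver's local disk and copying the file to the lake afterwards:

    import logging

    local_path = "/tmp/run.log"
    logging.basicConfig(filename=local_path, level=logging.INFO)
    logging.info("job started")
    logging.shutdown()  # flush and close handlers before copying

    dbutils.fs.cp(f"file:{local_path}", "dbfs:/mnt/datalake/logs/run.log")
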
13 votes · 0 answers

PySpark and Protobuf Deserialization UDF Problem

I'm getting the error Can't pickle <class 'google.protobuf.pyext._message.CMessage'>: it's not found as google.protobuf.pyext._message.CMessage when I try to create a UDF in PySpark. Apparently, PySpark uses CloudPickle to serialize the command…
Marc Vitalis
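
The usual remedy is to keep protobuf objects out of the UDF's closure: ship only raw bytes to the workers and parse inside the function. A hedged sketch in which MyMessage, its module, and the payload column are all hypothetical:

    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    def parse_name(raw):
        # Import inside the function so CloudPickle never has to
        # serialize the generated protobuf classes themselves.
        from my_protos_pb2 import MyMessage  # hypothetical generated module
        msg = MyMessage()
        msg.ParseFromString(bytes(raw))
        return msg.name

    parse_name_udf = udf(parse_name, StringType())
    df = df.withColumn("name", parse_name_udf("payload"))  # binary column assumed
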
13 votes · 2 answers

How to properly access dbutils in Scala when using Databricks Connect

I'm using Databricks Connect to run code in my Azure Databricks cluster locally from IntelliJ IDEA (Scala). Everything works fine. I can connect, debug, inspect locally in the IDE. I created a Databricks Job to run my custom app JAR, but it fails…
empz
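
The question is Scala, but for comparison the Python counterpart under classic Databricks Connect has the same wrinkle: dbutils is not a builtin outside notebooks and must be constructed from the session. A sketch:

    from pyspark.sql import SparkSession
    from pyspark.dbutils import DBUtils

    spark = SparkSession.builder.getOrCreate()
    dbutils = DBUtils(spark)
    print(dbutils.fs.ls("dbfs:/"))
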