Questions tagged [delta-live-tables]

Databricks Delta Live Tables (DLT) is an ETL framework that uses a simple declarative approach to build reliable data pipelines and automatically manage your infrastructure at scale.

Delta Live Tables simplifies the development of reliable data pipelines in Python and SQL by providing a framework that automatically handles dependencies between components, enforces data quality, and removes administrative overhead through automatic cluster and data maintenance, ...

149 questions
9
votes
1 answer

Databricks Delta Live Tables: Difference between STREAMING and INCREMENTAL

Is there a difference between CREATE STREAMING LIVE TABLE and CREATE INCREMENTAL LIVE TABLE? The documentation is inconsistent: for instance, STREAMING is used here, while INCREMENTAL is used here. I have tested both and so far I have not noticed any…
dwolfeu
  • 1,103
  • 2
  • 14
  • 21
7
votes
3 answers

Databricks: Ingesting CSV data to a Delta Live Table in Python triggers "invalid characters in table name" error - how to set column mapping mode?

First off, can I just say that I am learning Databricks at the time of writing this post, so I'd like simpler, cruder solutions as well as more sophisticated ones. I am reading a CSV file like this: df1 = spark.read.format("csv").option("header",…
Asfand Qazi
  • 6,586
  • 4
  • 32
  • 34
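
The "invalid characters" error above typically comes from CSV headers that contain spaces or characters such as " ,;{}()=", which Delta only accepts when column mapping is enabled on the target table. A minimal sketch of that approach, assuming a hypothetical file path and target table name (csv_bronze):

    import dlt

    # Sketch: enable Delta column mapping via table_properties so column names
    # with spaces or special characters are accepted. Path and name are placeholders.
    @dlt.table(
        name="csv_bronze",
        comment="Raw CSV ingested with column mapping enabled",
        table_properties={
            "delta.columnMapping.mode": "name",   # tolerate otherwise-invalid column names
            "delta.minReaderVersion": "2",
            "delta.minWriterVersion": "5",
        },
    )
    def csv_bronze():
        return (
            spark.read.format("csv")
            .option("header", "true")
            .option("inferSchema", "true")
            .load("/mnt/raw/input.csv")
        )

A cruder alternative that avoids column mapping entirely is to rename the offending columns (for example with select/alias) before returning the DataFrame.
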
7
votes
2 answers

Databricks Delta Live Tables - Apply Changes from delta table

I am working with Databricks Delta Live Tables, but have some problems with upserting some tables upstream. I know it is quite a long text below, but I tried to describe my problem as clearly as possible. Let me know if some parts are not clear. I…
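
The upsert pattern this question is about is APPLY CHANGES INTO, which in Python is expressed with dlt.create_streaming_table plus dlt.apply_changes. A hedged sketch, assuming a recent DLT runtime and hypothetical table and column names (orders_updates, orders_silver, order_id, updated_at):

    import dlt
    from pyspark.sql.functions import col

    # Stream the upstream Delta table that carries the changes/updates.
    @dlt.view(name="orders_cdc")
    def orders_cdc():
        return spark.readStream.table("my_schema.orders_updates")

    # Declare the target streaming table, then wire the CDC flow into it.
    dlt.create_streaming_table("orders_silver")

    dlt.apply_changes(
        target="orders_silver",          # table receiving the upserts
        source="orders_cdc",             # streaming source defined above
        keys=["order_id"],               # key used to match rows
        sequence_by=col("updated_at"),   # ordering column for out-of-order events
        stored_as_scd_type=1,            # keep only the latest version of each row
    )
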
6
votes
1 answer

How to direct DLT target table to a Unity Catalog Metastore

This question is pretty straightforward. It seems that in DLT you can define the output table name as below: @dlt.table(name="my_table_name") def my_pipeline(): ... This writes to the hive_metastore catalog, but how do I customize it for a…
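
For Unity Catalog, the catalog is not chosen in the decorator at all: in a Unity Catalog-enabled pipeline the catalog and target schema come from the pipeline settings, and the code keeps an unqualified table name. A sketch, where my_catalog and my_schema are hypothetical pipeline-setting values:

    import dlt

    # Sketch: with the pipeline configured with catalog = "my_catalog" and
    # target = "my_schema" (both placeholders), this table is published as
    # my_catalog.my_schema.my_table_name; the code only supplies the short name.
    @dlt.table(name="my_table_name")
    def my_pipeline():
        return spark.range(10)  # placeholder transformation
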
5
votes
1 answer

Delta live tables in Databricks can take only one target

If I need to publish two tables in two different databases in the metastore, do I need to create two different DLT pipelines? I am asking because I saw that in the pipeline settings I can only specify one target.
Rajib Deb
  • 1,496
  • 11
  • 30
5
votes
1 answer

Difference between LIVE TABLE and STREAMING LIVE TABLE

When using DLT, we can create a live table with either STREAMING LIVE TABLE or LIVE TABLE, as written in the docs: CREATE OR REFRESH { STREAMING LIVE TABLE | LIVE TABLE } table_name. What is the difference between the two syntaxes?
Will
  • 2,057
  • 1
  • 22
  • 34
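
Roughly, LIVE TABLE declares a table recomputed from its full input on each update, while STREAMING LIVE TABLE processes its source incrementally as a stream. A sketch of the Python-side equivalents, with events_bronze, daily_summary and events_silver as hypothetical names:

    import dlt

    # LIVE TABLE analogue: a complete read, recomputed from the full source.
    @dlt.table(name="daily_summary")
    def daily_summary():
        return dlt.read("events_bronze").groupBy("event_date").count()

    # STREAMING LIVE TABLE analogue: an incremental read that only picks up new rows.
    @dlt.table(name="events_silver")
    def events_silver():
        return dlt.read_stream("events_bronze").where("event_type IS NOT NULL")
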
5
votes
2 answers

Delta Live Tables with EventHub

I am trying to create a stream from Event Hubs using Delta Live Tables, but I am having trouble installing the library. Is it possible to install a Maven library for Delta Live Tables using sh/pip? I would like to…
repcak
  • 113
  • 8
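
A common workaround that avoids installing the Maven connector is to read Event Hubs through its Kafka-compatible endpoint with the built-in Kafka source. A hedged sketch; the namespace, topic (Event Hub name) and secret scope/key are placeholders:

    import dlt

    EH_NAMESPACE = "my-namespace"    # placeholder
    EH_TOPIC = "my-eventhub"         # placeholder (Event Hub name)
    EH_CONN = dbutils.secrets.get("my-scope", "eventhub-connection-string")

    @dlt.table(name="eventhub_raw")
    def eventhub_raw():
        return (
            spark.readStream.format("kafka")
            .option("kafka.bootstrap.servers", f"{EH_NAMESPACE}.servicebus.windows.net:9093")
            .option("subscribe", EH_TOPIC)
            .option("kafka.security.protocol", "SASL_SSL")
            .option("kafka.sasl.mechanism", "PLAIN")
            .option(
                "kafka.sasl.jaas.config",
                "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required "
                f'username="$ConnectionString" password="{EH_CONN}";',
            )
            .load()
        )
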
5
votes
4 answers

Module 'dlt' has no attribute 'table' - Databricks and Delta Live Tables

I am new to Databricks and Delta Live Tables. I have a problem creating a Delta Live Table in Python. How do I create a Delta Live Table from JSON files in FileStore?
Jelena Ajdukovic
  • 311
  • 3
  • 12
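
That error usually means a different dlt package is being picked up (for example an unrelated dlt library pip-installed on the cluster) or the notebook is being run interactively rather than as part of a DLT pipeline. A minimal working pattern inside a pipeline, with the JSON path as a placeholder:

    import dlt

    # Sketch: load raw JSON into a bronze table; the FileStore path is hypothetical.
    @dlt.table(
        name="json_bronze",
        comment="Raw JSON files loaded from FileStore",
    )
    def json_bronze():
        return spark.read.format("json").load("/FileStore/tables/my_json_dir/")

The decorated function only takes effect when the notebook is attached to a Delta Live Tables pipeline and the pipeline runs; executing the cell on a regular cluster will not create the table.
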
4
votes
0 answers

How does one get from Postgres RDB to Databricks Lakehouse Delta Lake?

How exactly can one create an efficient and reusable Databricks workflow for dumping a raw SQL database into the Delta Lake? Some confusion here is about the best approach to the following: handling drift in schemas (columns within DB tables) => Doing a…
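
One straightforward ingestion pattern is a JDBC read per source table into a bronze Delta table. A hedged sketch only; the connection details, secret scope and table list are placeholders, and it does not by itself address CDC:

    # Hypothetical connection details and table list.
    JDBC_URL = "jdbc:postgresql://db-host:5432/mydb"
    TABLES = ["public.customers", "public.orders"]

    for src in TABLES:
        df = (
            spark.read.format("jdbc")
            .option("url", JDBC_URL)
            .option("dbtable", src)
            .option("user", dbutils.secrets.get("my-scope", "pg-user"))
            .option("password", dbutils.secrets.get("my-scope", "pg-password"))
            .option("driver", "org.postgresql.Driver")
            .load()
        )
        # overwriteSchema lets a full re-dump replace the table schema when source
        # columns change -- one blunt answer to schema drift.
        (df.write.format("delta")
           .mode("overwrite")
           .option("overwriteSchema", "true")
           .saveAsTable(f"bronze.{src.split('.')[-1]}"))
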
4
votes
2 answers

How to monitor compute cost of DLT pipelines

I am looking for a way to find the DBU cost for DLT clusters. Does it get stored anywhere? I have been looking into event_logs but did not find information related to cost; they do have cluster resource utilization details. Here is what I found, could…
Chhaya Vishwakarma
  • 1,407
  • 9
  • 44
  • 72
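
The DLT event log records resource utilization but not cost. On Unity Catalog workspaces, one option is to query the billable-usage system table, assuming system.billing.usage is enabled in your workspace and its usage_metadata.dlt_pipeline_id field is populated for DLT clusters. A hedged sketch:

    # Placeholder pipeline id; both the table and the metadata field are assumptions
    # about what is enabled in your workspace.
    pipeline_id = "<your-pipeline-id>"

    dbu_per_day = spark.sql(f"""
        SELECT usage_date,
               sku_name,
               SUM(usage_quantity) AS dbus
        FROM system.billing.usage
        WHERE usage_metadata.dlt_pipeline_id = '{pipeline_id}'
        GROUP BY usage_date, sku_name
        ORDER BY usage_date
    """)
    display(dbu_per_day)

Multiplying the DBUs by your contract's per-DBU rate for the listed SKU then gives the dollar cost.
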
4
votes
1 answer

How to publish a delta live table (DLT) to a different catalog instead of hive_metastore

Hi community, I want to publish (save) the delta live table (DLT) into a different catalog's database. In the following image, the target field only asks for the database name, not for the catalog. I've referred to the documentation but couldn't find…
4
votes
0 answers

Delta Live CDC for Aggregate State Tables

As far as I can tell from the documentation, I cannot accomplish a specific migration from Delta to Delta Live that I would love to do... but I want to see if I might be missing a solution. Currently, I have a number of aggregate batch Delta tables…
4
votes
2 answers

How to import another module or package in a Databricks delta live tables

I am trying to import another module or package in my Databricks Delta Live Tables notebook and I am getting an error saying that %run or any magic command is not supported. Just wondering if there is any other way to import modules or packages.
Abhy
  • 61
  • 5
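
Since %run and other magics are not supported in DLT notebooks, a common workaround is to append the module's directory to sys.path and use a regular Python import. A sketch; the repo path, module and helper names are placeholders:

    import sys

    # Hypothetical path to the folder containing your shared code.
    sys.path.append("/Workspace/Repos/me@example.com/my_project/src")

    import dlt
    from my_helpers import clean_columns  # ordinary import now resolves

    @dlt.table(name="cleaned")
    def cleaned():
        return clean_columns(dlt.read("raw_events"))
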
4
votes
1 answer

Delta Live Tables for Batch Incremental Processing

Is it possible to use Delta Live Tables to perform incremental batch processing? Right now, I believe that this code will always load all of the data available in the directory when a pipeline is run: CREATE LIVE TABLE lendingclub_raw COMMENT "The raw…
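
Incremental ingestion is what a streaming table fed by Auto Loader is for: with a triggered pipeline it behaves as incremental batch, picking up only files added since the previous update. A sketch of the Python form, with the landing path as a placeholder:

    import dlt

    # Incremental batch pattern: cloudFiles (Auto Loader) tracks which files have
    # already been processed, so each pipeline update reads only new arrivals.
    @dlt.table(name="lendingclub_raw")
    def lendingclub_raw():
        return (
            spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "csv")
            .option("header", "true")
            .load("/mnt/landing/lendingclub/")
        )
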
3
votes
1 answer

Can a Delta Live Table (DLT) be passed as a parameter to a User Defined Functions (UDF) in Databricks?

Databricks' documentation on UDFs shows very simple examples, e.g. integer transformation with integers as parameters (https://docs.databricks.com/spark/latest/spark-sql/udf-python.html), but says nothing about passing Delta Live Tables as a…
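
A DLT table is not passed to a UDF directly; the UDF operates on columns of the DataFrame returned by dlt.read (or dlt.read_stream) inside the table definition. A sketch with hypothetical UDF and table names:

    import dlt
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    # The UDF receives column values, not the table itself.
    @udf(returnType=StringType())
    def normalize_name(name):
        return name.strip().title() if name else None

    @dlt.table(name="customers_clean")
    def customers_clean():
        df = dlt.read("customers_raw")  # the upstream DLT table as a DataFrame
        return df.withColumn("name", normalize_name("name"))
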