
I have a serious issue understanding Delta tables, Delta transaction logs, and Delta files.

Questions:

  • What and where are the Delta tables? I don't understand whether they live in the metastore (Hive), in the object store (S3), or in both.
  • What and where are the Delta files? Same situation: I don't understand what counts as a Delta file. I can see Parquet files both in Hive (multiple) and a single one in the /tmp folder.

This is what I have summarized from the information I read:

  • Delta tables

    • Contain the underlying file location path
    • Table properties
    • Table schema definition
    • Kept in a metastore (e.g. Apache Hive)
  • Delta transaction logs

    • Stores every executed transaction.
    • Single source of truth for delta table changes
    • Enable ACID properties and thus time travel and versioning
  • Delta Files

    • Transaction Log files
    • Atomic units - commits
    • Stored in cloud object storage (AWS, MS Azure, GCP) or a file system (HDFS)
    • Plain data files (Apache Parquet / ORC) or partition folders (Year-Month-Day)
  • Delta engine (paid)

    • Performance optimization for SQL and DataFrames

I'm not sure how to understand this.


1 Answer

Not sure if I follow the Databricks/Delta.io terminology here, but either way it may help your understanding. In Delta Lake, the metadata is part of the data: a delta table is the data files (parquet) plus the metadata (the DeltaLog, i.e. the _delta_log directory within the table directory). So a delta table directory usually looks something like below (for any file pattern here there can be many files; we also ignore some details like checkpoint files):

  • tablename/
    • part-*.snappy.parquet
    • _delta_log/*.crc
    • _delta_log/*.json

This is different from a parquet table, which does not hold all this metadata. Irrespective of the table format (e.g. delta or parquet), you register the table in Hive, and Hive will hold some metadata. Given that a delta table already carries a lot of metadata itself, the Hive metadata stored for it differs from what is stored for a parquet table (or any other format). All of this is probably best explained with an example. I will use (Azure) Databricks as the engine since everything is already set up in the runtime, but it should apply equally to OSS Spark + Delta + Hive. Azure Data Lake is the storage and the access protocol is abfss.
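As a side note: since a delta table is, at its core, a directory with a _delta_log, you can check programmatically whether a given path is a delta table. A minimal sketch in PySpark (the path is the example table location used below):

from delta.tables import DeltaTable

# Returns True only if the directory contains a _delta_log;
# a plain parquet directory returns False.
DeltaTable.isDeltaTable(spark, "abfss://mycontainer@mystorage.dfs.core.windows.net/test/diamonds_partitioned")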

Create database in Spark SQL

create database if not exists test_ext location 'abfss://mycontainer@mystorage.dfs.core.windows.net/test'

On Hive DB

SELECT * FROM [dbo].[DBS]

DB_ID  DESC  DB_LOCATION_URI                                          NAME      OWNER_NAME  OWNER_TYPE
11     NULL  abfss://mycontainer@mystorage.dfs.core.windows.net/test  test_ext  root        USER

Create tables in Spark SQL and insert data

We use the classic diamonds dataset, which is baked into the Databricks datasets but can also be found in public sources, e.g. GitHub/tidyverse/ggplot2/diamonds. We create two tables, one in delta and one in parquet format.

# read the diamonds csv from the Databricks datasets
df = spark.read.format('csv').options(header='true', inferSchema='true').load('/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv')
# write the same data once as a delta table and once as a parquet table
df.write.format("delta").partitionBy("cut").saveAsTable("test_ext.diamonds_partitioned")
df.write.format("parquet").partitionBy("cut").saveAsTable("test_ext.diamonds_partitioned_parquet")

File system

First we look at what happened on file system level. The database is a directory here:

storage_account/mycontainer/test

The tables can be found within:

storage_account/mycontainer/test/diamonds_partitioned
storage_account/mycontainer/test/diamonds_partitioned_parquet

Now on table level, we see (note that data files are in subdirectories due to the partitioning by column cut):

  • storage_account/mycontainer/test/diamonds_partitioned

    • _delta_log
      • 000*.crc
      • 000*.json
    • cut=Fair
      • part-000*.parquet
    • ...
  • storage_account/mycontainer/test/diamonds_partitioned_parquet

    • cut=Fair
      • part-000*.parquet
    • ...
    • _SUCCESS
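If you want to reproduce such listings yourself on Databricks, a small loop over dbutils.fs.ls is enough (a sketch; storage account and container are the example ones):

# list the delta table directory and its transaction log
base = "abfss://mycontainer@mystorage.dfs.core.windows.net/test"
for f in dbutils.fs.ls(f"{base}/diamonds_partitioned"):
    print(f.path)
for f in dbutils.fs.ls(f"{base}/diamonds_partitioned/_delta_log"):
    print(f.path)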

Hive metadata

Now we can look at different metadata such as columns, partitions, table parameters, etc.

On Hive DB

select * from dbo.[TBLS]

TBL_ID  CREATE_TIME  DB_ID  LAST_ACCESS_TIME  OWNER  RETENTION  SD_ID  TBL_NAME                      TBL_TYPE       VIEW_EXPANDED_TEXT  VIEW_ORIGINAL_TEXT  IS_REWRITE_ENABLED
32      1659599317   11     0                 root   0          32     diamonds_partitioned          MANAGED_TABLE  NULL                NULL                0
33      1659599932   11     0                 root   0          33     diamonds_partitioned_parquet  MANAGED_TABLE  NULL                NULL                0

The table parameters already show us some differences. For example, there is no partition information stored in Hive for the delta table; we find it in the _delta_log files instead, as that metadata is stored together with the data (see the sketch after the listing below). On the other hand, delta brings metadata which parquet does not have.

select * from dbo.[TABLE_PARAMS]

TBL_ID  PARAM_KEY                             PARAM_VALUE
32      delta.lastCommitTimestamp             1659599315000
32      delta.lastUpdateVersion               0
32      delta.minReaderVersion                1
32      delta.minWriterVersion                2
32      spark.sql.create.version              3.2.1
32      spark.sql.partitionProvider           catalog
32      spark.sql.sources.provider            delta
32      spark.sql.sources.schema              {"type":"struct","fields":[{"name":"_c0","type":"integer","nullable":true,"metadata":{}},{"name":"carat","type":"double","nullable":true,"metadata":{}},{"name":"cut","type":"string","nullable":true,"metadata":{}},{"name":"color","type":"string","nullable":true,"metadata":{}},{"name":"clarity","type":"string","nullable":true,"metadata":{}},{"name":"depth","type":"double","nullable":true,"metadata":{}},{"name":"table","type":"double","nullable":true,"metadata":{}},{"name":"price","type":"integer","nullable":true,"metadata":{}},{"name":"x","type":"double","nullable":true,"metadata":{}},{"name":"y","type":"double","nullable":true,"metadata":{}},{"name":"z","type":"double","nullable":true,"metadata":{}}]}
32      transient_lastDdlTime                 1659599317
33      spark.sql.create.version              3.2.1
33      spark.sql.partitionProvider           catalog
33      spark.sql.sources.provider            parquet
33      spark.sql.sources.schema              {"type":"struct","fields":[{"name":"_c0","type":"integer","nullable":true,"metadata":{}},{"name":"carat","type":"double","nullable":true,"metadata":{}},{"name":"color","type":"string","nullable":true,"metadata":{}},{"name":"clarity","type":"string","nullable":true,"metadata":{}},{"name":"depth","type":"double","nullable":true,"metadata":{}},{"name":"table","type":"double","nullable":true,"metadata":{}},{"name":"price","type":"integer","nullable":true,"metadata":{}},{"name":"x","type":"double","nullable":true,"metadata":{}},{"name":"y","type":"double","nullable":true,"metadata":{}},{"name":"z","type":"double","nullable":true,"metadata":{}},{"name":"cut","type":"string","nullable":true,"metadata":{}}]}
33      spark.sql.sources.schema.numPartCols  1
33      spark.sql.sources.schema.partCol.0    cut
33      transient_lastDdlTime                 1659599933
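To verify that the partition information of the delta table really lives in the transaction log, you can read the first commit file directly. Commit files are plain JSON lines, so spark.read.json handles them (a sketch; the path is the example one):

# The metaData action of the first commit carries the partition columns.
log_file = "abfss://mycontainer@mystorage.dfs.core.windows.net/test/diamonds_partitioned/_delta_log/00000000000000000000.json"
spark.read.json(log_file) \
    .where("metaData IS NOT NULL") \
    .select("metaData.partitionColumns") \
    .show(truncate=False)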

When we look at the column info, we realize that for the delta table Hive only stores a generic array<string>, while for the parquet table it stores metadata for all columns (a quick check follows the listing).

select * from dbo.[COLUMNS_V2]

CD_ID  COMMENT            COLUMN_NAME  TYPE_NAME      INTEGER_IDX
32     from deserializer  col          array<string>  0
33     NULL               _c0          int            0
33     NULL               carat        double         1
33     NULL               clarity      string         3
33     NULL               color        string         2
33     NULL               depth        double         4
33     NULL               price        int            6
33     NULL               table        double         5
33     NULL               x            double         7
33     NULL               y            double         8
33     NULL               z            double         9
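This generic entry does not limit Spark in any way, though: when reading the delta table, the actual schema is resolved from the DeltaLog, not from Hive:

# Hive only knows array<string> for the delta table, but Spark
# resolves the full column list from the _delta_log at read time.
spark.table("test_ext.diamonds_partitioned").printSchema()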

DeltaLog metadata

Finally, we can look at the metadata in the delta log. As you can see, there is a lot of table and column metadata but also data statistics.
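Much of the commit-level information shown below can also be surfaced without touching the files, via Delta's DESCRIBE HISTORY:

# One row per commit, including operation, operationParameters and operationMetrics.
spark.sql("DESCRIBE HISTORY test_ext.diamonds_partitioned").show(truncate=False)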

First 0000*.json.

[
  {
    "protocol": {
      "minReaderVersion": 1,
      "minWriterVersion": 2
    }
  },
  {
    "metaData": {
      "id": "ef7ac39f-0977-4992-a087-a6a9f49fe650",
      "format": {
        "provider": "parquet",
        "options": {}
      },
      "schemaString": "{\"type\":\"struct\",\"fields\":[{\"name\":\"_c0\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"carat\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"cut\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"color\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"clarity\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"depth\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"table\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"price\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"x\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"y\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"z\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}}]}",
      "partitionColumns": [
        "cut"
      ],
      "configuration": {},
      "createdTime": 1659599309716
    }
  },
  {
    "add": {
      "path": "cut=Fair/part-00000-fc4c091e-4b55-434c-92f6-6e87a18f6c76.c000.snappy.parquet",
      "partitionValues": {
        "cut": "Fair"
      },
      "size": 32547,
      "modificationTime": 1659599313000,
      "dataChange": true,
      "stats": "{\"numRecords\":1610,\"minValues\":{\"_c0\":9,\"carat\":0.22,\"color\":\"D\",\"clarity\":\"I1\",\"depth\":43.0,\"table\":49.0,\"price\":337,\"x\":0.0,\"y\":0.0,\"z\":0.0},\"maxValues\":{\"_c0\":53883,\"carat\":5.01,\"color\":\"J\",\"clarity\":\"VVS2\",\"depth\":79.0,\"table\":95.0,\"price\":18574,\"x\":10.74,\"y\":10.54,\"z\":6.98},\"nullCount\":{\"_c0\":0,\"carat\":0,\"color\":0,\"clarity\":0,\"depth\":0,\"table\":0,\"price\":0,\"x\":0,\"y\":0,\"z\":0}}",
      "tags": {
        "INSERTION_TIME": "1659599313000000",
        "OPTIMIZE_TARGET_SIZE": "268435456"
      }
    }
  },
  {
    "add": {
      "path": "cut=Good/part-00000-4ed7e5c8-7a6c-49a4-a2a8-9fdb2d81ead2.c000.snappy.parquet",
      "partitionValues": {
        "cut": "Good"
      },
      "size": 76488,
      "modificationTime": 1659599313000,
      "dataChange": true,
      "stats": "{\"numRecords\":4906,\"minValues\":{\"_c0\":3,\"carat\":0.23,\"color\":\"D\",\"clarity\":\"I1\",\"depth\":54.3,\"table\":51.0,\"price\":327,\"x\":0.0,\"y\":0.0,\"z\":0.0},\"maxValues\":{\"_c0\":53937,\"carat\":3.01,\"color\":\"J\",\"clarity\":\"VVS2\",\"depth\":67.0,\"table\":66.0,\"price\":18788,\"x\":9.44,\"y\":9.38,\"z\":5.79},\"nullCount\":{\"_c0\":0,\"carat\":0,\"color\":0,\"clarity\":0,\"depth\":0,\"table\":0,\"price\":0,\"x\":0,\"y\":0,\"z\":0}}",
      "tags": {
        "INSERTION_TIME": "1659599313000001",
        "OPTIMIZE_TARGET_SIZE": "268435456"
      }
    }
  },
  {
    "add": {
      "path": "cut=Ideal/part-00000-67ceda5d-c6c0-477f-8b27-0b1ba9688764.c000.snappy.parquet",
      "partitionValues": {
        "cut": "Ideal"
      },
      "size": 292297,
      "modificationTime": 1659599314000,
      "dataChange": true,
      "stats": "{\"numRecords\":21551,\"minValues\":{\"_c0\":1,\"carat\":0.2,\"color\":\"D\",\"clarity\":\"I1\",\"depth\":43.0,\"table\":43.0,\"price\":326,\"x\":0.0,\"y\":0.0,\"z\":0.0},\"maxValues\":{\"_c0\":53940,\"carat\":3.5,\"color\":\"J\",\"clarity\":\"VVS2\",\"depth\":66.7,\"table\":63.0,\"price\":18806,\"x\":9.65,\"y\":31.8,\"z\":6.03},\"nullCount\":{\"_c0\":0,\"carat\":0,\"color\":0,\"clarity\":0,\"depth\":0,\"table\":0,\"price\":0,\"x\":0,\"y\":0,\"z\":0}}",
      "tags": {
        "INSERTION_TIME": "1659599313000002",
        "OPTIMIZE_TARGET_SIZE": "268435456"
      }
    }
  },
  {
    "add": {
      "path": "cut=Premium/part-00000-8c05dee7-a5ce-4f29-a8b1-7bbc403f391d.c000.snappy.parquet",
      "partitionValues": {
        "cut": "Premium"
      },
      "size": 199912,
      "modificationTime": 1659599314000,
      "dataChange": true,
      "stats": "{\"numRecords\":13791,\"minValues\":{\"_c0\":2,\"carat\":0.2,\"color\":\"D\",\"clarity\":\"I1\",\"depth\":58.0,\"table\":51.0,\"price\":326,\"x\":0.0,\"y\":0.0,\"z\":0.0},\"maxValues\":{\"_c0\":53939,\"carat\":4.01,\"color\":\"J\",\"clarity\":\"VVS2\",\"depth\":63.0,\"table\":62.0,\"price\":18823,\"x\":10.14,\"y\":58.9,\"z\":8.06},\"nullCount\":{\"_c0\":0,\"carat\":0,\"color\":0,\"clarity\":0,\"depth\":0,\"table\":0,\"price\":0,\"x\":0,\"y\":0,\"z\":0}}",
      "tags": {
        "INSERTION_TIME": "1659599313000003",
        "OPTIMIZE_TARGET_SIZE": "268435456"
      }
    }
  },
  {
    "add": {
      "path": "cut=Very%20Good/part-00000-9a3e66de-4c74-4f30-af25-95fa2625e3a6.c000.snappy.parquet",
      "partitionValues": {
        "cut": "Very Good"
      },
      "size": 182719,
      "modificationTime": 1659599314000,
      "dataChange": true,
      "stats": "{\"numRecords\":12082,\"minValues\":{\"_c0\":6,\"carat\":0.2,\"color\":\"D\",\"clarity\":\"I1\",\"depth\":56.8,\"table\":44.0,\"price\":336,\"x\":0.0,\"y\":0.0,\"z\":0.0},\"maxValues\":{\"_c0\":53938,\"carat\":4.0,\"color\":\"J\",\"clarity\":\"VVS2\",\"depth\":64.9,\"table\":66.0,\"price\":18818,\"x\":10.01,\"y\":9.94,\"z\":31.8},\"nullCount\":{\"_c0\":0,\"carat\":0,\"color\":0,\"clarity\":0,\"depth\":0,\"table\":0,\"price\":0,\"x\":0,\"y\":0,\"z\":0}}",
      "tags": {
        "INSERTION_TIME": "1659599313000004",
        "OPTIMIZE_TARGET_SIZE": "268435456"
      }
    }
  },
  {
    "commitInfo": {
      "timestamp": 1659599315007,
      "userId": "1234567891234567",
      "userName": "user@org.com",
      "operation": "CREATE TABLE AS SELECT",
      "operationParameters": {
        "isManaged": "true",
        "description": null,
        "partitionBy": "[\"cut\"]",
        "properties": "{}"
      },
      "notebook": {
        "notebookId": "1234567891234567"
      },
      "clusterId": "1234-123456-abcdefghi",
      "isolationLevel": "WriteSerializable",
      "isBlindAppend": true,
      "operationMetrics": {
        "numFiles": "5",
        "numOutputRows": "53940",
        "numOutputBytes": "783963"
      },
      "engineInfo": "Databricks-Runtime/10.4.x-scala2.12",
      "txnId": "123abc-123abc-123abc-123abc-123abc"
    }
  }
]
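Note that on disk a commit file is newline-delimited JSON with one action per line; the array above is just pretty-printed for readability. If you have a local copy of a commit file, you can pretty-print it with the standard library (a sketch; the local file name is hypothetical):

import json

# Each line of the commit file is a single action
# (protocol, metaData, add, commitInfo, ...).
with open("00000000000000000000.json") as f:
    for line in f:
        print(json.dumps(json.loads(line), indent=2))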

And then 000*.crc.

{
  "tableSizeBytes": 783963,
  "numFiles": 5,
  "numMetadata": 1,
  "numProtocol": 1,
  "protocol": {
    "minReaderVersion": 1,
    "minWriterVersion": 2
  },
  "metadata": {
    "id": "ef7ac39f-0977-4992-a087-a6a9f49fe650",
    "format": {
      "provider": "parquet",
      "options": {}
    },
    "schemaString": "{\"type\":\"struct\",\"fields\":[{\"name\":\"_c0\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"carat\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"cut\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"color\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"clarity\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"depth\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"table\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"price\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"x\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"y\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"z\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}}]}",
    "partitionColumns": [
      "cut"
    ],
    "configuration": {},
    "createdTime": 1659599309716
  },
  "histogramOpt": {
    "sortedBinBoundaries": [
      0,
      8192,
      16384,
      32768,
      65536,
      ...
    ],
    "fileCounts": [
      0,
      0,
      1,
      0,
      1,
      ...
    ],
    "totalBytes": [
      0,
      0,
      32547,
      0,
      76488,
      ...
    ]
  }
}


ADDITION:

A small addition about the file format differences in the Hive metadata, which I recently stumbled upon and found a bit surprising. When you add a comment to a table via table properties, it is stored directly under the table ID in Hive dbo.table_params with PARAM_KEY = comment (assuming you follow the Hive standard for comments) for file formats like csv or parquet, but it does not show up in Hive for delta tables. For delta tables we only find the comment in the respective _delta_log. I find this a bit surprising, since comments are official table properties per the Hive LanguageManual DDL. I would have expected to find the comment both in the delta log and in Hive, but the Delta metadata implementation has probably consciously decided against it. But then why store other metadata, like spark.sql.sources.schema, twice? It's not perfectly clear to me.

You can easily test this as shown below.

create table if not exists test.test_csv (myint int) using CSV;
create table if not exists test.test_parquet (myint int) using parquet;
create table if not exists test.test_delta (myint int) using delta;

alter table test.test_csv SET TBLPROPERTIES ('comment' = 'A table comment');
alter table test.test_parquet SET TBLPROPERTIES ('comment' = 'A table comment');
alter table test.test_delta SET TBLPROPERTIES ('comment' = 'A table comment');
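From the Spark side, the comment is visible on all three tables; for the delta table it is resolved from the DeltaLog rather than from Hive. A quick check in PySpark:

# For test_delta the property comes from the _delta_log,
# for test_csv/test_parquet it comes from Hive's table_params.
spark.sql("SHOW TBLPROPERTIES test.test_delta").show(truncate=False)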

-- In Hive:

create or alter view dbo.v_tables as
SELECT
    TBLS.TBL_ID
    ,DBS.NAME as DB_NAME
    ,TBLS.TBL_NAME
    ,TBLS.TBL_TYPE
FROM TBLS
    INNER JOIN DBS ON TBLS.DB_ID = DBS.DB_ID

select * from dbo.v_tables tbls left join dbo.table_params params on tbls.TBL_ID = params.TBL_ID
where tbls.db_name = 'test' and tbls.tbl_name = 'test_csv'

...