
I have a serious issue understanding Delta tables, Delta transaction logs, and Delta files.

Questions:

  • What and where are the Delta tables? I don't understand whether they live in the metastore (Hive), in the object store (S3), or in both.
  • What and where are the Delta files? Same situation: I don't understand what counts as a Delta file. I can see Parquet files both in Hive (multiple) and a single one in the /tmp folder.

This is what I have summarized from the information I read:

  • Delta tables

    • Contain the underlying file location path
    • Table properties
    • Table schema definition
    • Kept in a metastore (e.g. Apache Hive)
  • Delta transaction logs

    • Stores every executed transaction.
    • Single source of truth for delta table changes
    • Enable ACID properties and thus time travel and versioning
  • Delta Files

    • Transaction Log files
    • Atomic units - commits
    • Stored in cloud object storage (AWS, MS Azure, GCP) or a file system (HDFS)
    • Plain data files (Apache Parquet / ORC) or partition folders (Year-Month-Day)
  • Delta engine (paid)

    • Performance optimization for SQL and DataFrames

I'm not sure how to understand this.


1 Answer

Not sure if I follow the Databricks/Delta.io terminology here, but either way it may help your understanding. In Delta Lake, the metadata is part of the data: a delta table is the data files (parquet) plus the metadata (the DeltaLog, i.e. the _delta_log directory within the table directory). So a delta table directory usually looks something like below (for any file pattern here there can be many files; we also ignore some details like checkpoint files):

  • tablename/
    • part-*.snappy.parquet
    • _delta_log/*.crc
    • _delta_log/*.json

This is different from a parquet table, which does not hold all this metadata. Irrespective of the table format (e.g. delta or parquet), you register the table in Hive, and Hive will hold some metadata. Given that a delta table already carries a lot of metadata itself, the Hive metadata stored for it differs from what is stored for a parquet table (or any other format). All of this is probably best explained with an example. I will use (Azure) Databricks as the engine since everything is already set up in the runtime, but it should apply equally to OSS Spark + Delta + Hive. Azure Data Lake is the storage and the access protocol is abfss.
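As a side note: since a delta table is, at its core, a directory with a _delta_log, you can check programmatically whether a given path is a delta table. A minimal sketch in PySpark (the path is the example table location used below):

from delta.tables import DeltaTable

# Returns True only if the directory contains a _delta_log;
# a plain parquet directory returns False.
DeltaTable.isDeltaTable(spark, "abfss://mycontainer@mystorage.dfs.core.windows.net/test/diamonds_partitioned")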

Create database in Spark SQL

create database if not exists test_ext location 'abfss://mycontainer@mystorage.dfs.core.windows.net/test'

On Hive DB

SELECT * FROM [dbo].[DBS]

DB_ID  DESC  DB_LOCATION_URI                                          NAME      OWNER_NAME  OWNER_TYPE
11     NULL  abfss://mycontainer@mystorage.dfs.core.windows.net/test  test_ext  root        USER

Create tables in Spark SQL and insert data

We use the classic diamonds dataset, which is baked into the Databricks datasets but can also be found in public sources, e.g. GitHub/tidyverse/ggplot2/diamonds. We create two tables, one in delta and one in parquet format.

# read the diamonds csv from the Databricks datasets
df = spark.read.format('csv').options(header='true', inferSchema='true').load('/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv')
# write the same data once as a delta table and once as a parquet table
df.write.format("delta").partitionBy("cut").saveAsTable("test_ext.diamonds_partitioned")
df.write.format("parquet").partitionBy("cut").saveAsTable("test_ext.diamonds_partitioned_parquet")

File system

First we look at what happened on file system level. The database is a directory here:

storage_account/mycontainer/test

The tables can be found within:

storage_account/mycontainer/test/diamonds_partitioned
storage_account/mycontainer/test/diamonds_partitioned_parquet

Now on table level, we see (note that data files are in subdirectories due to the partitioning by column cut):

  • storage_account/mycontainer/test/diamonds_partitioned

    • _delta_log
      • 000*.crc
      • 000*.json
    • cut=Fair
      • part-000*.parquet
    • ...
  • storage_account/mycontainer/test/diamonds_partitioned_parquet

    • cut=Fair
      • part-000*.parquet
    • ...
    • _SUCCESS
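If you want to reproduce such listings yourself on Databricks, a small loop over dbutils.fs.ls is enough (a sketch; storage account and container are the example ones):

# list the delta table directory and its transaction log
base = "abfss://mycontainer@mystorage.dfs.core.windows.net/test"
for f in dbutils.fs.ls(f"{base}/diamonds_partitioned"):
    print(f.path)
for f in dbutils.fs.ls(f"{base}/diamonds_partitioned/_delta_log"):
    print(f.path)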

Hive metadata

Now we can look at different metadata such as columns, partitions, table parameters, etc.

On Hive DB

select * from dbo.[TBLS]

TBL_ID  CREATE_TIME  DB_ID  LAST_ACCESS_TIME  OWNER  RETENTION  SD_ID  TBL_NAME                      TBL_TYPE       VIEW_EXPANDED_TEXT  VIEW_ORIGINAL_TEXT  IS_REWRITE_ENABLED
32      1659599317   11     0                 root   0          32     diamonds_partitioned          MANAGED_TABLE  NULL                NULL                0
33      1659599932   11     0                 root   0          33     diamonds_partitioned_parquet  MANAGED_TABLE  NULL                NULL                0

The table parameters already show us some differences. For example, there is no partition information stored in Hive for the delta table; we find it in the _delta_log files instead, as that metadata is stored together with the data (see the sketch after the listing below). On the other hand, delta brings metadata which parquet does not have.

select * from dbo.[TABLE_PARAMS]

TBL_ID  PARAM_KEY                             PARAM_VALUE
32      delta.lastCommitTimestamp             1659599315000
32      delta.lastUpdateVersion               0
32      delta.minReaderVersion                1
32      delta.minWriterVersion                2
32      spark.sql.create.version              3.2.1
32      spark.sql.partitionProvider           catalog
32      spark.sql.sources.provider            delta
32      spark.sql.sources.schema              {"type":"struct","fields":[{"name":"_c0","type":"integer","nullable":true,"metadata":{}},{"name":"carat","type":"double","nullable":true,"metadata":{}},{"name":"cut","type":"string","nullable":true,"metadata":{}},{"name":"color","type":"string","nullable":true,"metadata":{}},{"name":"clarity","type":"string","nullable":true,"metadata":{}},{"name":"depth","type":"double","nullable":true,"metadata":{}},{"name":"table","type":"double","nullable":true,"metadata":{}},{"name":"price","type":"integer","nullable":true,"metadata":{}},{"name":"x","type":"double","nullable":true,"metadata":{}},{"name":"y","type":"double","nullable":true,"metadata":{}},{"name":"z","type":"double","nullable":true,"metadata":{}}]}
32      transient_lastDdlTime                 1659599317
33      spark.sql.create.version              3.2.1
33      spark.sql.partitionProvider           catalog
33      spark.sql.sources.provider            parquet
33      spark.sql.sources.schema              {"type":"struct","fields":[{"name":"_c0","type":"integer","nullable":true,"metadata":{}},{"name":"carat","type":"double","nullable":true,"metadata":{}},{"name":"color","type":"string","nullable":true,"metadata":{}},{"name":"clarity","type":"string","nullable":true,"metadata":{}},{"name":"depth","type":"double","nullable":true,"metadata":{}},{"name":"table","type":"double","nullable":true,"metadata":{}},{"name":"price","type":"integer","nullable":true,"metadata":{}},{"name":"x","type":"double","nullable":true,"metadata":{}},{"name":"y","type":"double","nullable":true,"metadata":{}},{"name":"z","type":"double","nullable":true,"metadata":{}},{"name":"cut","type":"string","nullable":true,"metadata":{}}]}
33      spark.sql.sources.schema.numPartCols  1
33      spark.sql.sources.schema.partCol.0    cut
33      transient_lastDdlTime                 1659599933
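To verify that the partition information of the delta table really lives in the transaction log, you can read the first commit file directly. Commit files are plain JSON lines, so spark.read.json handles them (a sketch; the path is the example one):

# The metaData action of the first commit carries the partition columns.
log_file = "abfss://mycontainer@mystorage.dfs.core.windows.net/test/diamonds_partitioned/_delta_log/00000000000000000000.json"
spark.read.json(log_file) \
    .where("metaData IS NOT NULL") \
    .select("metaData.partitionColumns") \
    .show(truncate=False)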

When we look at the column info, we realize that for the delta table Hive only stores a generic array<string>, while for the parquet table it stores metadata for all columns (a quick check follows the listing).

select * from dbo.[COLUMNS_V2]

CD_ID  COMMENT            COLUMN_NAME  TYPE_NAME      INTEGER_IDX
32     from deserializer  col          array<string>  0
33     NULL               _c0          int            0
33     NULL               carat        double         1
33     NULL               clarity      string         3
33     NULL               color        string         2
33     NULL               depth        double         4
33     NULL               price        int            6
33     NULL               table        double         5
33     NULL               x            double         7
33     NULL               y            double         8
33     NULL               z            double         9
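This generic entry does not limit Spark in any way, though: when reading the delta table, the actual schema is resolved from the DeltaLog, not from Hive:

# Hive only knows array<string> for the delta table, but Spark
# resolves the full column list from the _delta_log at read time.
spark.table("test_ext.diamonds_partitioned").printSchema()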

DeltaLog metadata

Finally, we can look at the metadata in the delta log. As you can see, there is a lot of table and column metadata but also data statistics.
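Much of the commit-level information shown below can also be surfaced without touching the files, via Delta's DESCRIBE HISTORY:

# One row per commit, including operation, operationParameters and operationMetrics.
spark.sql("DESCRIBE HISTORY test_ext.diamonds_partitioned").show(truncate=False)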

First 0000*.json.

[
  {
    "protocol": {
      "minReaderVersion": 1,
      "minWriterVersion": 2
    }
  },
  {
    "metaData": {
      "id": "ef7ac39f-0977-4992-a087-a6a9f49fe650",
      "format": {
        "provider": "parquet",
        "options": {}
      },
      "schemaString": "{\"type\":\"struct\",\"fields\":[{\"name\":\"_c0\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"carat\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"cut\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"color\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"clarity\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"depth\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"table\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"price\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"x\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"y\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"z\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}}]}",
      "partitionColumns": [
        "cut"
      ],
      "configuration": {},
      "createdTime": 1659599309716
    }
  },
  {
    "add": {
      "path": "cut=Fair/part-00000-fc4c091e-4b55-434c-92f6-6e87a18f6c76.c000.snappy.parquet",
      "partitionValues": {
        "cut": "Fair"
      },
      "size": 32547,
      "modificationTime": 1659599313000,
      "dataChange": true,
      "stats": "{\"numRecords\":1610,\"minValues\":{\"_c0\":9,\"carat\":0.22,\"color\":\"D\",\"clarity\":\"I1\",\"depth\":43.0,\"table\":49.0,\"price\":337,\"x\":0.0,\"y\":0.0,\"z\":0.0},\"maxValues\":{\"_c0\":53883,\"carat\":5.01,\"color\":\"J\",\"clarity\":\"VVS2\",\"depth\":79.0,\"table\":95.0,\"price\":18574,\"x\":10.74,\"y\":10.54,\"z\":6.98},\"nullCount\":{\"_c0\":0,\"carat\":0,\"color\":0,\"clarity\":0,\"depth\":0,\"table\":0,\"price\":0,\"x\":0,\"y\":0,\"z\":0}}",
      "tags": {
        "INSERTION_TIME": "1659599313000000",
        "OPTIMIZE_TARGET_SIZE": "268435456"
      }
    }
  },
  {
    "add": {
      "path": "cut=Good/part-00000-4ed7e5c8-7a6c-49a4-a2a8-9fdb2d81ead2.c000.snappy.parquet",
      "partitionValues": {
        "cut": "Good"
      },
      "size": 76488,
      "modificationTime": 1659599313000,
      "dataChange": true,
      "stats": "{\"numRecords\":4906,\"minValues\":{\"_c0\":3,\"carat\":0.23,\"color\":\"D\",\"clarity\":\"I1\",\"depth\":54.3,\"table\":51.0,\"price\":327,\"x\":0.0,\"y\":0.0,\"z\":0.0},\"maxValues\":{\"_c0\":53937,\"carat\":3.01,\"color\":\"J\",\"clarity\":\"VVS2\",\"depth\":67.0,\"table\":66.0,\"price\":18788,\"x\":9.44,\"y\":9.38,\"z\":5.79},\"nullCount\":{\"_c0\":0,\"carat\":0,\"color\":0,\"clarity\":0,\"depth\":0,\"table\":0,\"price\":0,\"x\":0,\"y\":0,\"z\":0}}",
      "tags": {
        "INSERTION_TIME": "1659599313000001",
        "OPTIMIZE_TARGET_SIZE": "268435456"
      }
    }
  },
  {
    "add": {
      "path": "cut=Ideal/part-00000-67ceda5d-c6c0-477f-8b27-0b1ba9688764.c000.snappy.parquet",
      "partitionValues": {
        "cut": "Ideal"
      },
      "size": 292297,
      "modificationTime": 1659599314000,
      "dataChange": true,
      "stats": "{\"numRecords\":21551,\"minValues\":{\"_c0\":1,\"carat\":0.2,\"color\":\"D\",\"clarity\":\"I1\",\"depth\":43.0,\"table\":43.0,\"price\":326,\"x\":0.0,\"y\":0.0,\"z\":0.0},\"maxValues\":{\"_c0\":53940,\"carat\":3.5,\"color\":\"J\",\"clarity\":\"VVS2\",\"depth\":66.7,\"table\":63.0,\"price\":18806,\"x\":9.65,\"y\":31.8,\"z\":6.03},\"nullCount\":{\"_c0\":0,\"carat\":0,\"color\":0,\"clarity\":0,\"depth\":0,\"table\":0,\"price\":0,\"x\":0,\"y\":0,\"z\":0}}",
      "tags": {
        "INSERTION_TIME": "1659599313000002",
        "OPTIMIZE_TARGET_SIZE": "268435456"
      }
    }
  },
  {
    "add": {
      "path": "cut=Premium/part-00000-8c05dee7-a5ce-4f29-a8b1-7bbc403f391d.c000.snappy.parquet",
      "partitionValues": {
        "cut": "Premium"
      },
      "size": 199912,
      "modificationTime": 1659599314000,
      "dataChange": true,
      "stats": "{\"numRecords\":13791,\"minValues\":{\"_c0\":2,\"carat\":0.2,\"color\":\"D\",\"clarity\":\"I1\",\"depth\":58.0,\"table\":51.0,\"price\":326,\"x\":0.0,\"y\":0.0,\"z\":0.0},\"maxValues\":{\"_c0\":53939,\"carat\":4.01,\"color\":\"J\",\"clarity\":\"VVS2\",\"depth\":63.0,\"table\":62.0,\"price\":18823,\"x\":10.14,\"y\":58.9,\"z\":8.06},\"nullCount\":{\"_c0\":0,\"carat\":0,\"color\":0,\"clarity\":0,\"depth\":0,\"table\":0,\"price\":0,\"x\":0,\"y\":0,\"z\":0}}",
      "tags": {
        "INSERTION_TIME": "1659599313000003",
        "OPTIMIZE_TARGET_SIZE": "268435456"
      }
    }
  },
  {
    "add": {
      "path": "cut=Very%20Good/part-00000-9a3e66de-4c74-4f30-af25-95fa2625e3a6.c000.snappy.parquet",
      "partitionValues": {
        "cut": "Very Good"
      },
      "size": 182719,
      "modificationTime": 1659599314000,
      "dataChange": true,
      "stats": "{\"numRecords\":12082,\"minValues\":{\"_c0\":6,\"carat\":0.2,\"color\":\"D\",\"clarity\":\"I1\",\"depth\":56.8,\"table\":44.0,\"price\":336,\"x\":0.0,\"y\":0.0,\"z\":0.0},\"maxValues\":{\"_c0\":53938,\"carat\":4.0,\"color\":\"J\",\"clarity\":\"VVS2\",\"depth\":64.9,\"table\":66.0,\"price\":18818,\"x\":10.01,\"y\":9.94,\"z\":31.8},\"nullCount\":{\"_c0\":0,\"carat\":0,\"color\":0,\"clarity\":0,\"depth\":0,\"table\":0,\"price\":0,\"x\":0,\"y\":0,\"z\":0}}",
      "tags": {
        "INSERTION_TIME": "1659599313000004",
        "OPTIMIZE_TARGET_SIZE": "268435456"
      }
    }
  },
  {
    "commitInfo": {
      "timestamp": 1659599315007,
      "userId": "1234567891234567",
      "userName": "user@org.com",
      "operation": "CREATE TABLE AS SELECT",
      "operationParameters": {
        "isManaged": "true",
        "description": null,
        "partitionBy": "[\"cut\"]",
        "properties": "{}"
      },
      "notebook": {
        "notebookId": "1234567891234567"
      },
      "clusterId": "1234-123456-abcdefghi",
      "isolationLevel": "WriteSerializable",
      "isBlindAppend": true,
      "operationMetrics": {
        "numFiles": "5",
        "numOutputRows": "53940",
        "numOutputBytes": "783963"
      },
      "engineInfo": "Databricks-Runtime/10.4.x-scala2.12",
      "txnId": "123abc-123abc-123abc-123abc-123abc"
    }
  }
]
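Note that on disk a commit file is newline-delimited JSON with one action per line; the array above is just pretty-printed for readability. If you have a local copy of a commit file, you can pretty-print it with the standard library (a sketch; the local file name is hypothetical):

import json

# Each line of the commit file is a single action
# (protocol, metaData, add, commitInfo, ...).
with open("00000000000000000000.json") as f:
    for line in f:
        print(json.dumps(json.loads(line), indent=2))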

And then 000*.crc.

{
  "tableSizeBytes": 783963,
  "numFiles": 5,
  "numMetadata": 1,
  "numProtocol": 1,
  "protocol": {
    "minReaderVersion": 1,
    "minWriterVersion": 2
  },
  "metadata": {
    "id": "ef7ac39f-0977-4992-a087-a6a9f49fe650",
    "format": {
      "provider": "parquet",
      "options": {}
    },
    "schemaString": "{\"type\":\"struct\",\"fields\":[{\"name\":\"_c0\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"carat\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"cut\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"color\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"clarity\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"depth\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"table\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"price\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"x\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"y\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}},{\"name\":\"z\",\"type\":\"double\",\"nullable\":true,\"metadata\":{}}]}",
    "partitionColumns": [
      "cut"
    ],
    "configuration": {},
    "createdTime": 1659599309716
  },
  "histogramOpt": {
    "sortedBinBoundaries": [
      0,
      8192,
      16384,
      32768,
      65536,
      ...
    ],
    "fileCounts": [
      0,
      0,
      1,
      0,
      1,
      ...
    ],
    "totalBytes": [
      0,
      0,
      32547,
      0,
      76488,
      ...
    ]
  }
}


ADDITION:

A small addition about the file format differences in the Hive metadata, which I recently stumbled upon and found a bit surprising. When you add a comment to a table via table properties, it is stored directly under the table ID in Hive dbo.table_params with PARAM_KEY = comment (assuming you follow the Hive standard for comments) for file formats like csv or parquet, but it does not show up in Hive for delta tables. For delta tables we only find the comment in the respective _delta_log. I find this a bit surprising, since comments are official table properties per the Hive LanguageManual DDL. I would have expected to find the comment both in the delta log and in Hive, but the Delta metadata implementation has probably consciously decided against it. But then why store other metadata, like spark.sql.sources.schema, twice? It's not perfectly clear to me.

You can easily test this as shown below.

create table if not exists test.test_csv (myint int) using CSV;
create table if not exists test.test_parquet (myint int) using parquet;
create table if not exists test.test_delta (myint int) using delta;

alter table test.test_csv SET TBLPROPERTIES ('comment' = 'A table comment');
alter table test.test_parquet SET TBLPROPERTIES ('comment' = 'A table comment');
alter table test.test_delta SET TBLPROPERTIES ('comment' = 'A table comment');
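From the Spark side, the comment is visible on all three tables; for the delta table it is resolved from the DeltaLog rather than from Hive. A quick check in PySpark:

# For test_delta the property comes from the _delta_log,
# for test_csv/test_parquet it comes from Hive's table_params.
spark.sql("SHOW TBLPROPERTIES test.test_delta").show(truncate=False)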

-- In Hive:

create or alter view dbo.v_tables as
SELECT
    TBLS.TBL_ID
    ,DBS.NAME as DB_NAME
    ,TBLS.TBL_NAME
    ,TBLS.TBL_TYPE
FROM TBLS
    INNER JOIN DBS ON TBLS.DB_ID = DBS.DB_ID

select * from dbo.v_tables tbls left join dbo.table_params params on tbls.TBL_ID = params.TBL_ID
where tbls.db_name = 'test' and tbls.tbl_name = 'test_csv'

...