Using an expression in a PARTITIONED BY definition in Delta Table

Question

Attempting to load data into Databricks using COPY INTO, I have data in storage (as CSV files) that has the following schema:

event_time TIMESTAMP,
aws_region STRING,
event_id STRING,
event_name STRING

I wish for the target table to be partitioned by DAY, which should be extracted from the event_time column.

However, attempting to use an expression in the PARTITIONED BY column yields the following error:

CREATE TABLE IF NOT EXISTS MY_TABLE (
  event_time TIMESTAMP,
  aws_region STRING,
  event_id STRING,
  event_name STRING
)
PARTITIONED BY(TO_DATE(event_time))

java.sql.SQLException: [Databricks]DatabricksJDBCDriver ERROR processing query/statement. Error Code: 0, SQL state: org.apache.hive.service.cli.HiveSQLException: Error running query: [DELTA_OPERATION_NOT_ALLOWED] com.databricks.sql.transaction.tahoe.DeltaAnalysisException: Operation not allowed: Partitioning by expressions is not supported for Delta tables

My limitation is that I cannot alter the existing schema, for example by creating a new field:

event_date DATE

And use that as the partition by column.

Is there any other way to overcome this?

score 0 · Answer 1 · edited Mar 14 '23 at 16:03

0

You need to use GENERATED column like this (doc):

CREATE TABLE IF NOT EXISTS MY_TABLE (
  event_time TIMESTAMP,
  event_date date GENERATED ALWAYS AS (TO_DATE(event_time)),
  aws_region STRING,
  event_id STRING,
  event_name STRING
)
PARTITIONED BY(event_date)

You can read more about generated columns in the linked documentation or here.

edited Mar 14 '23 at 16:03

Peter Kolárovec

3
2

answered Sep 01 '22 at 12:29

Alex Ott

80,552
8
87
132

I don't wish for the table to create an `event_date` column, as specified in the question. That's part of the limitation. – Yuval Itzchakov Sep 01 '22 at 12:42
you can't use an expression in the `partitioned by`... Spark needs a column – Alex Ott Sep 01 '22 at 13:51

Using an expression in a PARTITIONED BY definition in Delta Table

1 Answers1