We have DBT models which we use to run on AWS Athena tables. It creates Hive external tables behind the scenes. Now we have a situation where the data type of column may change in future. Athena tables based on Hive does not allow to change data type of a column, but Apache Iceberg table does. We can change data type of a column in Apache Iceberg tables.
We copied the data from old Hive table to Iceberg table but when we run the DBT model, it below error:
[error] [MainThread]: An error occurred (InvalidInputException) when calling the GetPartitions operation:
The DBT configuration for the model is as follows. It used to work with Hive external tables but not working with Apache Iceberg tables.
{{
config(materialized='incremental',
external_location="s3://" + env_var('BUCKET-NAME') + "/" + env_var('SCHEMA-NAME') + "/" + this.identifier,
partitioned_by = ['event_date'],
incremental_strategy='insert_overwrite',
on_schema_change='ignore'
)
}}
The Apache Iceberg is created as below:
CREATE TABLE iceberg_table (
id int,
data string,
event_date string)
PARTITIONED BY (event_date)
LOCATION 's3://DOC-EXAMPLE-BUCKET/iceberg-folder'
TBLPROPERTIES (
'table_type'='ICEBERG',
'format'='parquet',
'write_target_data_file_size_bytes'='536870912',
'optimize_rewrite_delete_file_threshold'='10'
)