I am using Spark Structured Streaming (3.1.1) to read data from Kafka, with Hudi (0.8.0) as the storage layer on S3, partitioning the data by date. (This part works without any problems.)
I am looking to use Trino (355) to query that data. As a precursor, I have already placed hudi-presto-bundle-0.8.0.jar in /data/trino/hive/.
I created a table with the following schema:
CREATE TABLE table_new (
    columns, dt
) WITH (
    partitioned_by = ARRAY['dt'],
    external_location = 's3a://bucket/location/',
    format = 'parquet'
);
Even after calling the procedure below, Trino is unable to discover any partitions:
CALL system.sync_partition_metadata('schema', 'table_new', 'FULL')
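For reference, this is one way to confirm that no partitions were registered (a sketch using the Hive connector's hidden $partitions table; the catalog, schema, and table names are the ones from the example above):
-- Lists the partitions the metastore has registered for this table;
-- in this case it comes back empty.
SELECT * FROM hive.schema."table_new$partitions";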
My assessment is that I am unable to create a Hudi-backed table in Trino, largely because I am not passing the right values in the WITH options.
I am also unable to find a CREATE TABLE example for Hudi in the documentation.
I would really appreciate it if anyone could give me an example of that, or point me in the right direction in case I've missed anything.
Really appreciate the help
Small update: I tried adding
connector = 'hudi'
to the WITH options, but this throws the error:
Catalog 'hive' does not support table property 'connector'
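For clarity, this is roughly how I passed it (a sketch; the rest of the DDL is unchanged from the example above, only the WITH clause is shown):
) WITH (
    connector = 'hudi',
    partitioned_by = ARRAY['dt'],
    external_location = 's3a://bucket/location/',
    format = 'parquet'
);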