I saw in this announcement https://aws.amazon.com/about-aws/whats-new/2020/09/amazon-redshift-spectrum-adds-support-for-querying-open-source-apache-hudi-and-delta-lake/ that Redshift Spectrum supports Hudi and Delta Lake. We currently use Iceberg as our table format, and we need to expose some tables to the BI team through Redshift Spectrum. I created an external schema and an external table, but when I query the table, Redshift Spectrum returns more data than it should. We upsert data based on a primary key, and with the setup I tried, Redshift Spectrum returns every record ever written for the same id instead of only the latest version of it (as a window partitioned by id would). Has anyone successfully integrated Iceberg with AWS Redshift Spectrum?
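To make the expected result concrete, here is a minimal Python sketch of the "latest version per id" behaviour I am after. It only illustrates the semantics and is not how Redshift Spectrum or Iceberg work internally. The column names `id`, `updated_at`, and `status`, and the sample rows, are hypothetical.

```python
def latest_per_id(rows, key="id", version="updated_at"):
    """Keep only the most recent row for each key value.

    `key` and `version` are hypothetical column names used for illustration.
    """
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        # Replace the stored row only if this one is newer.
        if current is None or row[version] > current[version]:
            latest[row[key]] = row
    return sorted(latest.values(), key=lambda r: r[key])


# What the external table currently returns: every version of each id.
rows = [
    {"id": 1, "updated_at": "2021-01-01", "status": "new"},
    {"id": 1, "updated_at": "2021-01-05", "status": "shipped"},
    {"id": 2, "updated_at": "2021-01-02", "status": "new"},
]

# What I expect to see: one row per id, the latest one.
result = latest_per_id(rows)
for row in result:
    print(row)
```

Here the two rows for id 1 collapse into the single row with the later `updated_at`, which is what the table should return after an upsert.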