I'm using AWS Athena, and I have a query like this:
SELECT * FROM foo ORDER BY purchase_date ASC
I want to de-duplicate the records it returns.
Since Athena processes records in parallel, I'm not sure how to write the DISTINCT clause.
How can I write this query so that the result set contains no duplicate records?
Thanks