I have a large Hive table with almost a million records, partitioned by date. I want to find the latest date based on the most recent partition added to the table. On some days, or even for weeks, there may be no records, so I cannot use current_date or current_date - 1 to find the last date. I also tried max(ingest_date), but it took almost 140 minutes to return the date.

Here's a sample of the partitions:

ingest_date=2019-6-10
ingest_date=2019-6-7
ingest_date=2019-6-6
ingest_date=2019-6-5
ingest_date=2019-6-4

Is there a better way to find the latest date in the Hive table from its partitions, without using the MAX() function?
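
One direction that might avoid scanning the data is to read only the partition names (for example, the lines printed by `SHOW PARTITIONS`) and pick the newest one on the client side. The following is a minimal sketch of that idea, not a tested solution: it assumes the partition lines have already been collected into a list of strings, that `ingest_date` is the only partition key, and that the function name `latest_partition_date` is purely illustrative. Because the dates are not zero-padded, it parses them into real dates instead of comparing strings (as plain text, `2019-6-7` sorts after `2019-6-10`).

```python
from datetime import date


def latest_partition_date(partition_lines, key="ingest_date"):
    """Return the newest date among lines such as 'ingest_date=2019-6-10'.

    Hypothetical helper: assumes a single partition key and
    year-month-day values that may lack zero padding.
    """
    prefix = key + "="
    dates = []
    for line in partition_lines:
        line = line.strip()
        if not line.startswith(prefix):
            continue  # skip blank lines and anything that is not a partition name
        year, month, day = (int(part) for part in line[len(prefix):].split("-"))
        dates.append(date(year, month, day))
    # Compare as dates, not strings, so 2019-6-10 correctly beats 2019-6-7.
    return max(dates) if dates else None


# The sample partitions from above, as they might appear in the listing.
sample = [
    "ingest_date=2019-6-10",
    "ingest_date=2019-6-7",
    "ingest_date=2019-6-6",
    "ingest_date=2019-6-5",
    "ingest_date=2019-6-4",
]

print(latest_partition_date(sample))  # 2019-06-10
```

This only touches partition metadata, so its cost should depend on the number of partitions and not on the number of rows.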

Incognito
