I am new to Hive. I receive data on the first of every month, and it is stored in a partitioned Hive table. Suppose I instead receive data in the middle of the month (on any date). Should I delete the old partition and create a new partition with the latest date?

Please suggest a solution.

himanish
  • What is your current partition key? You can partition by `month`, then delete the data for that month and reload it. – Koushik Roy Nov 09 '20 at 14:48
  • We can choose the partition key ourselves. Do we have to delete the old partition, or is there an alternative? – himanish Nov 09 '20 at 15:11
  • The partitioning scheme is not necessarily related to how you load data. A data increment may contain both updates and new data for an existing partition that already holds old data. It all depends on your scenario and design. – leftjoin Nov 09 '20 at 15:40
  • We have to make the best decision. – himanish Nov 09 '20 at 15:58
  • Look at this answer: https://stackoverflow.com/a/37744071/2700344 – leftjoin Nov 10 '20 at 08:01
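
The suggestion in the comments (partition by month, then replace that month's data on reload) could look roughly like the sketch below. It only builds the HiveQL text and does not connect to Hive. The table `sales`, the staging table `sales_staging`, the columns, and the partition column `load_month` are all hypothetical names chosen for illustration, and the sketch assumes the staging table holds the complete data for the month being reloaded. Because the partition value is derived from the month rather than the arrival day, a mid-month delivery maps to the same partition as a first-of-month one, and `INSERT OVERWRITE` replaces that partition's contents without a separate drop.

```python
from datetime import date
from typing import Sequence


def month_key(arrival: date) -> str:
    """Map any arrival date to its month partition value (YYYY-MM)."""
    return arrival.strftime("%Y-%m")


def reload_month_hql(table: str, staging: str,
                     columns: Sequence[str], arrival: date) -> str:
    """Build a HiveQL statement that overwrites one month partition.

    All identifiers passed in are hypothetical; `load_month` is an
    assumed partition column name, not something from the question.
    """
    part = month_key(arrival)
    cols = ", ".join(columns)
    return (
        f"INSERT OVERWRITE TABLE {table} PARTITION (load_month='{part}')\n"
        f"SELECT {cols} FROM {staging};"
    )


if __name__ == "__main__":
    # A delivery on the 17th targets the same partition as one on the 1st.
    print(reload_month_hql("sales", "sales_staging",
                           ["id", "amount"], date(2020, 11, 17)))
```

If the mid-month delivery contains only changes rather than the full month, overwriting the partition would lose the earlier rows, so this sketch fits only the full-reload case.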

0 Answers