3

I need to create an external table for an read only existed folder structure.

The original hive table has partitions like year, month and day. Their files store in path in this way HDFS_PATH/2016/10/04/

But when I create an external table in hive, it always store to path like HDFS_PATH/year=2016/month=10/day=04/

Although I can change 1 partition location like in this link How to Update/Drop a Hive Partition? , but I have to set location for each partition combination.

Is there a simpler way to set it once and for all?

Community
  • 1
  • 1
Ray Wu
  • 993
  • 16
  • 34
  • Just to be clear: are you looking for a way to create one specific table with one specific folder structure for Year/Month/Day? Or do you need a generic solution that would also work for partitions like Quarter, or PostalCode? – Dennis Jaheruddin Oct 28 '16 at 07:35
  • create only 1 table, but it has many partitions – Ray Wu Oct 28 '16 at 08:19

3 Answers3

2

Thought it best to answer your question since the other answers don't explain.

The only way to use dynamic partitioning (not listing out every partition and adding individually) is to use the "Hive style" partitioning you mentioned in the format: HDFS_PATH/year=2016/month=10/day=04/.

While HCatalog has support for custom paths, Hive itself does not and there is an -open ticket tracking this feature here.

Until this ticket is resolved, you have to use the "Hive style" format you mentioned (and mentioned in this ticket) or manually add all partitions yourself using the ALTER TABLE command. Once this ticket is resolved, then MSCK REPAIR TABLE could work like you want.

gurooj
  • 2,100
  • 4
  • 21
  • 25
1

A bit late:

 alter table my_table add partition (year='2017',month='07',day='10', hour='10') location '2017/07/10/10';
belostoky
  • 934
  • 2
  • 11
  • 22
  • Thanks for your reply but then I have to enumerate all combinations of all partitions, which is what I tried to avoid. – Ray Wu Jul 12 '17 at 19:49
0

You can do this dinamically by setting: set hive.exec.dynamic.partition=true; or you can just do msck repair table table_name that will partition your data.

Farslan
  • 96
  • 7
  • it doesn't solve my requirement `without column name in path` – Ray Wu Oct 28 '16 at 17:06
  • 1
    I see, then I think you have to do set all partition combination, but you can write a scripts that increments day bay day adding partition. In this way, you can add or drop partition automatically. – Farslan Oct 30 '16 at 08:08