
I want to repair a Hive table for any newly added/deleted partitions. Instead of manually running the MSCK REPAIR command in Hive, is there any way to achieve this in Java? I am trying to get all partitions from HDFS and from the Hive metastore, and after comparing them I will put the newly added/deleted partitions into the Hive metastore. But I am not able to find such an API on HiveContext. I tried to get all the partitions using HiveContext, but it throws a "table not found" error:

System.out.println(hiveContext.metadataHive().getTable("anshu","mytable").getAllPartitions());

Is there any way to add/remove partitions in hive using java?

mahan07
  • AFAIK you *must* open a direct connection to the Metastore service; Spark does not expose its own. Look into https://hive.apache.org/javadocs/r2.1.1/api/ under class `HiveMetaStoreClient`, methods `listPartitionNames(...)` and `getPartition(...)`; then class `Partition`, method `getSd()`; then class `StorageDescriptor`, method `getLocation()` – Samson Scharfrichter Jan 14 '17 at 20:34

1 Answer


Spark option:

Using HiveContext you can execute MSCK REPAIR as in the example below (shown in PySpark; from Java the same `hiveContext.sql(...)` call works). No need to do it manually.

sqlContext = HiveContext(sc)
sqlContext.sql("MSCK REPAIR TABLE your_table")

Is there any way to add/remove partitions in hive using java?

Plain Java option:

If you want to do it in plain Java without using Spark, you can use the class HiveMetaStoreClient to query the Hive metastore directly.
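As a minimal sketch of that approach (it assumes the hive-metastore dependency on the classpath, a running metastore service at a hypothetical thrift URI, and reuses the "anshu"/"mytable" names from the question, so it is not runnable as-is):

```java
import java.util.List;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

public class PartitionLister {
    public static void main(String[] args) throws Exception {
        HiveConf conf = new HiveConf();
        // hypothetical metastore address; point this at your own service
        conf.setVar(HiveConf.ConfVars.METASTOREURIS, "thrift://metastore-host:9083");
        HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
        try {
            // (short) -1 asks for all partitions rather than capping the result
            List<Partition> parts = client.listPartitions("anshu", "mytable", (short) -1);
            for (Partition p : parts) {
                // the storage descriptor carries the HDFS location of each partition
                System.out.println(p.getSd().getLocation());
            }
            // client.add_partition(...) and client.dropPartition(...) are the
            // calls for syncing newly added/deleted partitions afterwards
        } finally {
            client.close();
        }
    }
}
```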


Please see my answer here with example usage
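The compare-and-sync idea from the question (diffing HDFS partition paths against metastore partition paths) boils down to two set differences. A self-contained sketch, with hypothetical partition values:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class PartitionDiff {
    // Partitions present in `want` but missing from `have`:
    // run once each way to find what to add and what to drop.
    static Set<String> missingFrom(Set<String> have, Set<String> want) {
        Set<String> diff = new TreeSet<>(want);
        diff.removeAll(have);
        return diff;
    }

    public static void main(String[] args) {
        Set<String> hdfs = new TreeSet<>(Arrays.asList(
            "dt=2017-01-13", "dt=2017-01-14", "dt=2017-01-15"));
        Set<String> metastore = new TreeSet<>(Arrays.asList(
            "dt=2017-01-12", "dt=2017-01-13", "dt=2017-01-14"));

        // in HDFS but not in the metastore -> add to the metastore
        System.out.println("to add:  " + missingFrom(metastore, hdfs));
        // in the metastore but gone from HDFS -> drop from the metastore
        System.out.println("to drop: " + missingFrom(hdfs, metastore));
    }
}
```

Feed the "to add" set to `add_partition` and the "to drop" set to `dropPartition` on the metastore client.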

Ram Ghadiyaram
  • thanks, but the Hive metastore `listPartitions` method only lists up to Short.MAX_VALUE (32767) partitions. If I have 1 lakh (100,000) partitions, how do I achieve that? Also, which approach is better: using sqlContext.sql, or listing all partitions in Hive via the metastore and comparing them with all the partitions in HDFS? – mahan07 Jan 15 '17 at 11:56
  • First, you have to look more closely into the issue you mentioned (1 lakh partitions); to be honest, I don't know. Second, if you are using Spark, `hivecontext.sql` is the better approach rather than writing code with `HiveMetaStoreClient`; if you don't want to use Spark, then you have to go with `HiveMetaStoreClient` – Ram Ghadiyaram Jan 15 '17 at 13:03