6

i have an external partitioned table named employee with partition(year,month,day), everyday a new file come and seat at the particular day location call for today's date it will be at 2016/10/13.

TABLE SCHEMA:
create External table employee(EMPID Int,FirstName String,.....)
partitioned by (year string,month string,day string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/.../emp';

so everyday we need to run command which is working fine as

ALTER TABLE employee ADD IF NOT EXISTS PARTITION (year=2016,month=10,day=14) LOCATION '/.../emp/2016/10/14';

but once we are trying with below command because we don't want to execute the above alter table command manually, it throws below Error

hive> MSCK REPAIR TABLE employee;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

Note:

hive> MSCK TABLE employee; //this show me that a partition has not added in the table
OK
Partitions not in metastore: employee:2016/10/14
Time taken: 1.066 seconds, Fetched: 1 row(s)

please help me as i stuck with this. do we have any workaround for this type of situation?

user4157124
  • 2,809
  • 13
  • 27
  • 42
anand
  • 503
  • 1
  • 7
  • 20
  • 3
    I got a workaround solution for my problem, which is if the partition name will be like 'year=2016/month=10/day=13' then we can use below command and it is working... set hive.msck.path.validation=ignore; MSCK REPAIR TABLE table_name; – anand Oct 18 '16 at 05:13

1 Answers1

33

I got a workaround solution for my problem which is, if the table static partition name is like 'year=2016/month=10/day=13' then we can use below command and it is working...

set hive.msck.path.validation=ignore;
MSCK REPAIR TABLE table_name;
anand
  • 503
  • 1
  • 7
  • 20
  • 3
    If you have a large number of partitions you might need to set hive.msck.repair.batch.size. When there is a large number of untracked partitions, there is a provision to run MSCK REPAIR TABLE batch-wise to avoid OOME (Out of Memory Error). y giving the configured batch size for the property hive.msck.repair.batch.size it can run in the batches internally. T the default value of the property is 0, it means it will execute all the partitions at once. – bsao Nov 30 '18 at 17:41