I have a HiveQL script that performs some operations on a Hive table. Before running these operations, I want to check whether the required partition exists and, if it does not, terminate the script. How can I achieve this?
-
If you use Hive why is it marked as [apache-spark]? – Alper t. Turker May 04 '18 at 10:12
1 Answer
Using shell:
table_name="schema.table"
partition_spec="key=value"
# List the table's partitions and keep only the lines that match the spec
partition_exists=$(hive -e "show partitions $table_name" | grep "$partition_spec")
# An empty result means no partition matched
if [ -z "$partition_exists" ]; then echo "not exists"; else echo "exists"; fi
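The question also asks to terminate the script when the partition is missing, so here is a minimal sketch of that step. The helper names `has_partition` and `require_partition` and the file name `operations.hql` are hypothetical, not part of Hive. It assumes `show partitions` prints one partition per line (multi-level partitions appear as `k1=v1/k2=v2`), and it matches the whole line so that `key=value` does not also match `key=value2`.

```shell
#!/usr/bin/env bash

# has_partition SPEC   (hypothetical helper)
# Reads partition names on stdin, one per line, and succeeds only if
# one line is exactly equal to SPEC (-F: fixed string, -x: whole line).
has_partition() {
  grep -Fx -- "$1" >/dev/null
}

# require_partition TABLE SPEC   (hypothetical helper)
# Returns 0 if the partition is listed by `show partitions`, otherwise
# prints a message to stderr and returns 1.
require_partition() {
  local table_name="$1" partition_spec="$2"
  # The pipeline is the condition of an `if`, so a failed match does
  # not abort the script even when `set -e` is active.
  if hive -S -e "show partitions $table_name" | has_partition "$partition_spec"; then
    return 0
  fi
  echo "partition $partition_spec not found in $table_name" >&2
  return 1
}

# Usage in the calling script (left commented out so that this block
# only defines functions):
#   require_partition "schema.table" "key=value" || exit 1
#   hive -f operations.hql
```

With exact matching, a multi-level partition has to be given in full, for example `k1=v1/k2=v2`.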

leftjoin
-
Just a side note: if the above is put in a shell script and that script is called from another script (say, parent.sh), then parent.sh should not have "set -e" set; otherwise "echo not exists" will not be executed. The reason is that the command substitution assigned to partition_exists actually returns a non-zero exit code if the partition does not exist. – soMuchToLearnAndShare Apr 27 '20 at 15:51
-
Instead of removing "set -e", it can be done like this (sorry, I could not edit the above comment after a certain time): https://stackoverflow.com/a/53612582/4582240 – soMuchToLearnAndShare Apr 27 '20 at 16:15
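For illustration, one common way to keep "set -e" and still reach the "not exists" branch is to swallow grep's exit status 1 (no match) with `|| true`. This is a sketch of that general pattern and not necessarily what the linked answer does. The names `find_partition` and `check_under_errexit` are hypothetical, and a fixed partition list stands in for the real hive call.

```shell
#!/usr/bin/env bash

# find_partition SPEC   (hypothetical helper)
# Reads partition names on stdin and prints the lines containing SPEC.
# grep exits with status 1 when nothing matches; `|| true` turns that
# into success so that `set -e` does not abort the script.
find_partition() {
  grep -F -- "$1" || true
}

# check_under_errexit SPEC   (hypothetical demo)
# Runs in a subshell with `set -e` enabled. The printf stands in for
# `hive -e "show partitions $table_name"`.
check_under_errexit() (
  set -e
  partition_exists=$(printf 'key=a\nkey=b\n' | find_partition "$1")
  if [ -z "$partition_exists" ]; then echo "not exists"; else echo "exists"; fi
)

# check_under_errexit "key=a"    prints "exists"
# check_under_errexit "key=zzz"  prints "not exists"
```

The trade-off is that `|| true` also hides real grep errors (exit status 2), so checking the pipeline inside an `if` condition is a reasonable alternative.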