7

1- I'm trying to delete multiple partitions at once, but struggling to do it with either Impala or Hive. I tried the following query, with and without ':

ALTER TABLE cz_prd_corrti_st.s1mme_transstats_info DROP IF EXISTS PARTITION (pr_load_time='20170701000317') PARTITION (pr_load_time='20170701000831')

The error I'm getting is as follow:

AnalysisException: Syntax error in line 3: PARTITION (pr_load_time='20170701000831') ^ Encountered: PARTITION Expected: CACHED, LOCATION, PURGE, SET, UNCACHED CAUSED BY: Exception: Syntax error

The partition column is bigint type, query for deleting only one partition works as expected:

ALTER TABLE cz_prd_corrti_st.s1mme_transstats_info DROP IF EXISTS
PARTITION   (pr_load_time='20170701000317')

2- Is it a good practice delete the hdfs data first and then drop the partitions in Impala/Hive, or is it supposed to be done vice versa?

Massimiliano Kraus
  • 3,638
  • 5
  • 27
  • 47
k_mishap
  • 451
  • 2
  • 8
  • 17

1 Answers1

21

1.

Your syntax is wrong.
In the DROP command the partitions should be separated by commas.

Demo

hive> create table t (i int) partitioned by (p int);
OK

hive> alter table t add partition (p=1) partition(p=2) partition(p=3) partition(p=4) partition(p=5);
OK

hive> show partitions t;
OK
partition
p=1
p=2
p=3
p=4
p=5

hive> alter table t drop if exists partition (p=1),partition (p=2),partition(p=3);
Dropped the partition p=1
Dropped the partition p=2
Dropped the partition p=3
OK

hive> show partitions t;
OK
partition
p=4
p=5

2.

You can drop a range.

Demo

hive> create table t (i int) partitioned by (p int);
OK

hive> alter table t add partition (p=1) partition(p=2) partition(p=3) partition(p=4) partition(p=5);
OK

hive> show partitions t;
OK
partition
p=1
p=2
p=3
p=4
p=5

hive> alter table t drop if exists partition (p<=3);
Dropped the partition p=1
Dropped the partition p=2
Dropped the partition p=3
OK

hive> show partitions t;
OK
partition
p=4
p=5
David דודו Markovitz
  • 42,900
  • 6
  • 64
  • 88
  • You're indeed correct, Hive works with mentioned syntax. Havent found solution for Impala yet. – k_mishap Aug 08 '17 at 08:30
  • 2
    Unfortunately comparators in partition predicate (`partition (p<=3)`) do not work in Spark SQL yet, see https://issues.apache.org/jira/browse/SPARK-14922 – ruhong Jan 03 '19 at 04:47
  • The second demo works with Impala 2.8+. See https://impala.apache.org/docs/build/html/topics/impala_alter_table.html where it says "In Impala 2.8 and higher, the expression for the partition clause with a DROP or SET operation can include comparison operators such as <, IN, or BETWEEN, and Boolean operators such as AND and OR." – Michael Schaefers Nov 26 '21 at 10:33