15

Hive version 1.1

I have a hive external table as below:

 CREATE EXTERNAL TABLE `schedule_events`(
  `schedule_id` string COMMENT 'from deserializer',
  `service_key` string COMMENT 'from deserializer',
  `event_start_date_time` string COMMENT 'from deserializer',
  `event_id` string COMMENT 'from deserializer',
  `event_type` string COMMENT 'from deserializer',
  `transitional_key` string COMMENT 'from deserializer',
  `created_date_time` string COMMENT 'from deserializer',
  `bus_date` string COMMENT 'from deserializer')
    PARTITIONED BY (
                    `year` string,
                    `month` string,
                    `day` string)
   ROW FORMAT SERDE
   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
   STORED AS INPUTFORMAT
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
   OUTPUTFORMAT
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
   LOCATION
   'hdfs://nameservice1/hadoop/raw/omega/scheduled_events'
  TBLPROPERTIES (
    'avro.schema.url'='hdfs:////hadoop/raw/omega/schema/schedule_events.avsc',
   'transient_lastDdlTime'='1505742141')

Now to drop a particular partition I can run a ALTER command as below

 ALTER TABLE schedule_events DROP IF EXISTS PARTITION  (year='2016',month='06',day='01')
 Dropped the partition year=2016/month=06/day=01

 hive> show partitions schedule_events;
 OK
 year=2017/month=09/day=01
 year=2017/month=09/day=02
 year=2017/month=09/day=03
 year=2017/month=09/day=04
 year=2017/month=09/day=05

But this table is having many partitions.

How do I drop all existing partitions at once? I would like to delete all existing partitions at once? Is that possible?

peterh
  • 11,875
  • 18
  • 85
  • 108
Surender Raja
  • 3,553
  • 8
  • 44
  • 80
  • Try `ALTER TABLE schedule_events DROP IF EXISTS PARTITION (year is not null)` – Ambrish Sep 19 '17 at 20:52
  • @Ambrish I don't think that would work. Your query `ALTER TABLE schedule_events DROP IF EXISTS PARTITION (year is not null)` would check if a partition `year is not null` exists which is wrong. – Ani Menon Jan 24 '18 at 20:08

5 Answers5

28

There are multiple options, here is one:

alter table schedule_events drop if exists partition (year<>'');

Hive: Extend ALTER TABLE DROP PARTITION syntax to use all comparators

"... To drop a partition from a Hive table, this works:
ALTER TABLE foo DROP PARTITION(ds = 'date')
...but it should also work to drop all partitions prior to date.
ALTER TABLE foo DROP PARTITION(ds < 'date') This task is to implement ALTER TABLE DROP PARTITION for all of the comparators, < > <= >= <> = != instead of just for ="

https://issues.apache.org/jira/browse/HIVE-2908

David דודו Markovitz
  • 42,900
  • 6
  • 64
  • 88
  • 1
    just as a side note, I tried this on aws athena and it didn't work. I get the following error code `no viable alternative at input 'alter table TABLE_NAME drop' (service: amazonathena; status code: 400; error code: invalidrequestexception; request id: 3fe0eb78-2a17-...)` – otmezger Jun 06 '19 at 09:35
  • @otmezger, Athena has nothing to do with Hive – David דודו Markovitz Jun 06 '19 at 17:03
  • Athena is just hive under the hood, David is wrong. The issue (which is hard to discern from the error message) is that Athena insists on using double quotes instead of single quotes. – Bill Clark May 12 '20 at 19:37
  • 1
    @BillClark - No, Athena is Presto under the hood. https://docs.aws.amazon.com/athena/latest/ug/presto-functions.html https://aws.amazon.com/big-data/what-is-presto/ – David דודו Markovitz May 13 '20 at 05:20
  • 2
    Fair enough, though the differences between the two are irrelevant here. The point is the error was due to using single quotes rather than double quotes, and is not at all obvious from the error message itself. Otherwise, the syntax is identical for Hive, or Presto (and thus Athena) – Bill Clark Jun 03 '20 at 22:54
13

You may use something similar to this:

ALTER TABLE schedule_events drop if exists partition (year>'0');
Ani Menon
  • 27,209
  • 16
  • 105
  • 126
  • ALTER TABLE table_name DROP IF EXISTS PARTITION(year>0) . This query worked for me. Thanks a lot – sunitha Aug 28 '19 at 07:24
1

alter table schema_name.table_name drop partition (partition_column != '');

J_V
  • 357
  • 3
  • 6
0

Using spark sql:

val paritions_values = spark.sql("show partitions "+databasename+'.'+tablename)
.collect().map(f=>f(0).toString)
.toArray.mkString("partition(", "," , "\")")
.replace("," , "\") ,partition(")
.replace("=", "=\"")

spark.sql("alter table "+databasename+'.'+tablename+" drop "+paritions_values)
0

For example : suppose partitions are on date and the name is partition_column:-

alter table database.table_name drop if exists partition (partition_column>'2023-01-01');
Hari_pb
  • 7,088
  • 3
  • 45
  • 53