
After adding a partition to an external table in Hive, how can I update/drop it?

darcyy

6 Answers


You can update a Hive partition by, for example:

ALTER TABLE logs PARTITION(year = 2012, month = 12, day = 18) 
SET LOCATION 'hdfs://user/darcy/logs/2012/12/18';

This command does not move the old data, nor does it delete the old data. It simply sets the partition to the new location.

To drop a partition, you can do

ALTER TABLE logs DROP IF EXISTS PARTITION(year = 2012, month = 12, day = 18);
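Note that for an external table, `DROP PARTITION` by default removes only the metastore entry; the files under the partition directory stay on HDFS. A hedged sketch (the `PURGE` clause applies to managed tables and skips the trash):

```sql
-- For an external table this drops only the metadata; data files remain on HDFS.
ALTER TABLE logs DROP IF EXISTS PARTITION(year = 2012, month = 12, day = 18);

-- For a managed table, PURGE deletes the partition data immediately
-- instead of moving it to the trash:
-- ALTER TABLE logs DROP PARTITION(year = 2012, month = 12, day = 18) PURGE;
```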
darcyy
    Just FYI, for Spark SQL this will also not work to update an existing partition's location, mostly because the Spark SQL API does not support it. Also from the Hive CLI, you would need to run `use ` first, otherwise it will fail with `FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter partition. Unable to alter partitions because table or database does not exist.` even if the table does exist. So if the table name is `.`, from the CLI you have to run `use ` first before changing the partition location.
    – Marcus Aug 22 '20 at 00:10
  • This appears to hang forever with an ORC table – MikeKulls Sep 08 '20 at 01:04
  • I take that back, it just takes 3 minutes to drop an empty partition. Normal Hadoop performance. – MikeKulls Sep 08 '20 at 01:10

In addition, you can drop multiple partitions in one statement (see Dropping multiple partitions in Impala/Hive).

Extract from the above link:

hive> alter table t drop if exists partition (p=1), partition (p=2), partition (p=3);
Dropped the partition p=1
Dropped the partition p=2
Dropped the partition p=3
OK

EDIT 1:

Also, you can drop partitions in bulk using a comparison operator (>, <, <>), for example:

ALTER TABLE t 
DROP PARTITION (PART_COL > 1);
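Range predicates can also be combined to bound the drop on both sides. A sketch, assuming the same table t partitioned by an integer column PART_COL:

```sql
-- Drops every partition whose PART_COL value lies strictly between 1 and 5.
ALTER TABLE t DROP IF EXISTS PARTITION (PART_COL > 1, PART_COL < 5);
```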
F.Lazarescu
ALTER TABLE table_name DROP PARTITION (partition_name);
Dev Kumar
  • Please add some explanation to your answer such that others can learn from it - there are already other answers using other approaches. Can you explain why yours looks different? – Nico Haase May 15 '20 at 13:46
  • this does not work, why does it have so many likes? -1 from me – pavel_orekhov Aug 09 '23 at 14:51

You can either copy files into the folder where the external partition is located, or use

INSERT OVERWRITE TABLE tablename1 PARTITION (partcol1=val1, partcol2=val2...)...

statement.
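A minimal sketch of the second option (`staging_logs` and the column list are illustrative, not from the question):

```sql
-- Replaces the contents of one partition with the result of the query.
INSERT OVERWRITE TABLE logs PARTITION (year = 2012, month = 12, day = 18)
SELECT ip, request, status
FROM staging_logs
WHERE log_day = '2012-12-18';
```

If you copy files into a new partition directory instead, the metastore does not know about it yet; `MSCK REPAIR TABLE logs;` (or an explicit `ALTER TABLE ... ADD PARTITION`) registers it.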

Sergey Zyuzin

You may also need to make the database containing the table active:

use [dbname]

otherwise you may get an error (even if you specify the database, i.e. dbname.table):

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter partition. Unable to alter partitions because table or database does not exist.
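Putting the two statements together, a sketch (db_name is a placeholder; the path is the one from the accepted answer):

```sql
use db_name;

ALTER TABLE logs PARTITION(year = 2012, month = 12, day = 18)
SET LOCATION 'hdfs://user/darcy/logs/2012/12/18';
```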

  • Any idea if there's a workaround for doing the same operation in `spark.sql()`, which only supports one command at a time? (`spark.sql("use ")` followed by `spark.sql("alter table ...")` does not work, especially for cases where you'd want to change a partition format, which is not allowed in add partition.) – Marcus Apr 17 '20 at 21:30

Apart from the other answers in this post, for multiple partitions, do this:

ALTER TABLE database.table_name DROP PARTITION (partition_column >= value);

For example, for database employee, table accounts, and partition column event_date:

ALTER TABLE employee.accounts DROP PARTITION (event_date>='2023-02-25');

This will drop all partitions whose event_date is 25 Feb 2023 or later.

Hari_pb