Drop column of hive table stored as orc

Question

I have hive table created as below:

create table alpha001(id int, name string) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true')

Now i want to drop one of the columns, say 'name'. I tried the following:

ALTER TABLE alpha001 REPLACE COLUMNS (id int);

which results in below

Exception thrown: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Replace columns is not supported for table default.alpha001. SerDe may be incompatible.

and following

ALTER TABLE alpha001 DROP name;

Exception thrown : FAILED: ParseException line 1:26 mismatched input 'name' expecting PARTITION near 'DROP' in drop partition statement

Depends on the version of Hive you are using. The latest hive releases support drop column. — V Sree Harissh, Aug 18 '15 at 16:49

score 8 · Answer 1 · answered Feb 22 '18 at 06:52

Unfortunately, you can't! The only way you can delete column from existing table is by using REPLACE COLUMNS keyword. But this can be done only for tables with a native SerDe (DynamicSerDe, MetadataTypedColumnsetSerDe, LazySimpleSerDe and ColumnarSerDe).

Your best bet is recreating the schema. Follows the steps.

Check if the table is external. If it isn't, use the following statement to make it external.
```
alter table alpha001 set tblproperties('EXTERNAL'='TRUE');
```
Drop the table. Since the table is an external table, you can drop it without dropping the actual table.
Recreate the table with the new schema. You should be able to access the table with new schema.

Follows a quick sample.

create table alpha001(id int, name string) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');

--assuming your table is not EXTERNAL already
alter table alpha001 set tblproperties('EXTERNAL'='TRUE');

insert into alpha001 values(1,"A");

select * from alpha001;

OK
1       A

drop table alpha001;

create table alpha001(id int) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');

select * from alpha001;

OK
1
Time tak

Hope that helps!

This is so unbelievable. What a hassle just for dropping a column. — Naitree, Oct 22 '20 at 05:42

score 0 · Answer 2 · answered Aug 18 '15 at 11:15

0

you cant drop column for hive table. On using "ALTER TABLE alpha001 DROP name" it gives error because in hive dropping column is not supported but we cant drop partition.So here ,it searches for Partition name near drop which 'name' column is not.

So in order to drop column ,you have to 1.drop table and re-create table with column eliminated(but may result in loss of data if its internal table) 2.you can use REPLACE command(refer this https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/Partition/Column)

answered Aug 18 '15 at 11:15

Ankit Agrahari

349
9
22

The REPLACE command is working only in case we are creating table like this : create table xyz(id int, name string) – Ann Aug 19 '15 at 12:37
1

But if we are creating table like this : create table alpha001(id int, name string) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true') , then we cant use ALTER TABLE alpha001 REPLACE COLUMNS (id int). It throws FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Replace columns is not supported for table default.alpha001. SerDe may be incompatible. – Ann Aug 19 '15 at 12:39

score 0 · Answer 3 · edited May 12 '22 at 13:09

You are using the replace command in a wrong way, you need to omit the columns that you want to drop while using replace command.

describe formatted bucketed_table;
OK 
col_name        data_type       comment
# col_name              data_type               comment

id                      int
firstname               string
last_name               string

hive>alter table bucketed_table drop column id;

MismatchedTokenException(58!=196)
        at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:61                                                                                                7)
        at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
        at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionSpec(Hiv                                                                                                eParser_IdentifiersParser.java:10839)
        at org.apache.hadoop.hive.ql.parse.HiveParser.dropPartitionSpec(HiveParser.java:49854                                                                                                )
        at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatementSuffixDropPartitions(Hive                                                                                                Parser.java:13140)
        at org.apache.hadoop.hive.ql.parse.HiveParser.alterTableStatementSuffix(HiveParser.ja                                                                                                va:9370)
        at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatement(HiveParser.java:8803)
        at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:4057)
        at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1786)
        at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1152)
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:211)
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:171)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:438)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1224)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1265)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1151)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:                                                                                                43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
FAILED: ParseException line 1:32 mismatched input 'column' expecting PARTITION near 'drop' in                                                                                                 drop partition statement


hive> alter table bucketed_table replace columns(firstname string , lastname string);
OK
Time taken: 0.164 seconds


hive> select * from bucketed_table;
OK
bucketed_table.firstname        bucketed_table.lastname
3       nikhil
3       nikhil
1       tutu
1       avi
4       tanvi
10      vivek
1       tutu
1       avi
4       tanvi
10      vivek
5       akshay
5       akshay
Time taken: 0.138 seconds, Fetched: 12 row(s)


hive> describe bucketed_table;
OK
col_name        data_type       comment
firstname               string
lastname                string

so from above even if you replace the column , if the bucketing is done on that column it will still persist but if that column doesn't have bucketing on it then it will be removed both from metadata and hdfs.

score 0 · Answer 4 · answered Aug 19 '19 at 14:12

0

I just found out that we are not able to drop the columns in table which is stored as ORC. And only way is we can use "create table as select" option to create a new table with desired columns.

answered Aug 19 '19 at 14:12

user11789810

15
5

Drop column of hive table stored as orc

4 Answers4

Linked