
We have a Hive managed table (it is both partitioned and bucketed, with transactional = 'true'). We are using Spark (version 2.4) to interact with this Hive table.

We are able to ingest data into this table successfully using the following:

sparkSession.sql("insert into table values(''))

But we are not able to delete a row from this table. We are attempting the delete with the command below:

sparkSession.sql("delete from table where col1 = '' and col2 = '')

We are getting an operationNotAccepted exception.

Do we need to do anything specific to be able to perform this action?

Thanks

Anuj

  • Did you check this ? - https://stackoverflow.com/questions/17810537/how-to-delete-and-update-a-record-in-hive – abc_spark Jul 29 '22 at 09:46

1 Answer


Unless it is a Delta table, this is not possible.
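
If moving the data into Delta format is acceptable, deletes become possible from Spark 2.4. A minimal sketch, assuming the table is first rewritten as Delta files under a made-up path and Delta Lake 0.6.x (the release line compatible with Spark 2.4); the table, column and path names are placeholders:

import io.delta.tables.DeltaTable

// one-time copy of the existing Hive table's data into Delta format
sparkSession.table("my_db.my_table")
  .write
  .format("delta")
  .partitionBy("part_col")
  .save("/data/delta/my_table")

// on Spark 2.4 deletes go through the DeltaTable API
// (SQL DELETE on Delta needs Spark 3.x / Delta 0.7+)
val deltaTable = DeltaTable.forPath(sparkSession, "/data/delta/my_table")
deltaTable.delete("col1 = 'someValue' AND col2 = 'otherValue'")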

ORC does not support DELETE for Hive bucketed tables; see https://github.com/qubole/spark-acid

Apache Hudi on AWS could also be an option.
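
A rough sketch of how a delete could look through the Hudi Spark datasource, hedged: it assumes the data has already been rewritten as a Hudi table with col1 as record key and part_col as partition path, Hudi 0.5.x on Spark 2.4, and all table, column and path names are placeholders:

import org.apache.spark.sql.SaveMode

// rows whose keys should be removed
val toDelete = sparkSession.sql(
  "select col1, col2, part_col from my_db.my_table where col1 = 'someValue' and col2 = 'otherValue'")

// issue a hard delete through the Hudi datasource
toDelete.write
  .format("org.apache.hudi")
  .option("hoodie.datasource.write.operation", "delete")
  .option("hoodie.datasource.write.recordkey.field", "col1")
  .option("hoodie.datasource.write.partitionpath.field", "part_col")
  .option("hoodie.datasource.write.precombine.field", "col2")
  .option("hoodie.table.name", "my_table_hudi")
  .mode(SaveMode.Append)
  .save("/data/hudi/my_table")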

thebluephantom