
I know that if I submit a query from Hive, a shared lock will be acquired and the Hive table will be locked by the query: https://cwiki.apache.org/confluence/display/Hive/Locking

So I wonder: if the query is executed through a Spark HiveContext, will the lock be acquired and will the table get locked as well? Also, if I insert data into the table through a Spark HiveContext, will it require an exclusive lock?

Thanks

JerryLi
  • Good question. The Hive Metastore API exposes methods such as `MetaStoreClient.lock(LockRequest)` returning a `LockResponse` (cf. https://hive.apache.org/javadocs/r2.1.1/api/index.html?org/apache/hadoop/hive/metastore/HiveMetaStoreClient.html), but at first glance the Spark code base does not use either `LockRequest` or `LockResponse`. So I guess Spark can be locked out by a Hive query (cf. http://stackoverflow.com/questions/42421883/not-able-to-create-view-on-hive-table-using-hivecontext-getting-dblock-manager), but Spark will not take locks by itself (see the sketch after these comments)... – Samson Scharfrichter Mar 09 '17 at 22:42
  • ...unless you request the lock yourself, with an explicit "LOCK TABLE" command (cf. http://stackoverflow.com/questions/36474638/locking-hive-table-from-spark-hivecontext) – Samson Scharfrichter Mar 09 '17 at 22:44
  • BTW you can check for yourself: open a Hive session, start a Spark job in another console, and while Spark is loading data, run "SHOW LOCKS" commands in Hive. Maybe Spark manages locks by hitting ZooKeeper directly without using the MetaStore API, but I doubt it. – Samson Scharfrichter Mar 09 '17 at 22:50
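
For anyone who needs the Spark-side work to actually hold a Hive lock, here is a minimal sketch of driving the metastore lock API directly from the driver, along the lines suggested in the comments above. The database/table names, the SHARED_READ lock type, and the surrounding object are illustrative assumptions; hive-site.xml is assumed to be on the classpath so the client talks to the same metastore as the HiveContext.

```scala
import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient
import org.apache.hadoop.hive.metastore.api.{LockComponent, LockLevel, LockRequest, LockResponse, LockState, LockType}
import scala.collection.JavaConverters._

object ManualHiveLock {
  def main(args: Array[String]): Unit = {
    // Connects to the metastore configured in hive-site.xml.
    val client = new HiveMetaStoreClient(new HiveConf())

    // Shared (read) lock on default.my_table -- names are placeholders.
    val component = new LockComponent(LockType.SHARED_READ, LockLevel.TABLE, "default")
    component.setTablename("my_table")

    val request = new LockRequest(
      List(component).asJava,
      System.getProperty("user.name"),
      java.net.InetAddress.getLocalHost.getHostName)

    val response: LockResponse = client.lock(request)
    if (response.getState == LockState.ACQUIRED) {
      try {
        // ... run the Spark job that reads default.my_table here ...
      } finally {
        // Release the lock so concurrent Hive queries are not blocked.
        client.unlock(response.getLockid)
      }
    }
    client.close()
  }
}
```

A lock taken this way should then show up under "SHOW LOCKS" in a Hive session, which is also an easy way to run the verification experiment described in the last comment.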

1 Answer


Explicit locking (LOCK TABLE / UNLOCK TABLE) was supported in Spark SQL 1.6, but it is not supported in the 2.x and 3.x versions.

https://github.com/apache/spark/blob/branch-2.2/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4

unsupportedHiveNativeCommands
...
    | kw1=LOCK kw2=TABLE
    | kw1=LOCK kw2=DATABASE
    | kw1=UNLOCK kw2=TABLE
    | kw1=UNLOCK kw2=DATABASE
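
To make this concrete, here is a small sketch (the table name and session setup are illustrative, not from the original answer) of what happens when you try the statement on 2.x/3.x: the parser rejects it outright, so an explicit lock has to be taken outside Spark, e.g. in a Hive session or through the metastore API shown in the comments above.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("lock-table-check")
  .enableHiveSupport()
  .getOrCreate()

try {
  // Worked through a HiveContext on Spark 1.6; rejected by the parser on 2.x/3.x
  // because LOCK TABLE falls under unsupportedHiveNativeCommands.
  spark.sql("LOCK TABLE my_table SHARED")
} catch {
  case e: org.apache.spark.sql.catalyst.parser.ParseException =>
    println(e.getMessage) // typically "Operation not allowed: LOCK TABLE ..."
}
```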
Fuad Efendi