3

I need to do some SQL queries (as here) directly from Metasore.
PS: the commands SHOW/DESCRIBE are not enough.

How to enable access from it as database, or what the database name of Metastore? ... In nowadays (2019) it is possible?


NOTES

  • What is Metastore?
    For me is a very important element of the Hive architecture, final user need some access to it... "All Hive implementation need a metastore service, where it stores metadata. It is implemented using tables in relational database. By default, Hive uses built-in Derby SQL server", 1.
    Of course, you need in your context a "standard" Metastore. On my corporation's Hadoop cluster we are planning to standardize Metastore (local and long term standard), perhaps PostgreSQL and also an (PostgREST API for external consume of some SQL-Views from it).

  • The SQL definitions (table names, etc.) will be stable and Metastore queries will be reliable when Metastore is a long-term local standard.

  • The Metastore it is closely connected to Hive, where it is a Java API, but Metastore is also a standard RDBMS and offers standard connection (by SQL) for the external universe.
    PS: my interest on Metastore is in this external context.

Peter Krauss
  • 13,174
  • 24
  • 167
  • 304
  • Metastore is a Java service with a Java API. That's the "proper" way to interact with it. – Samson Scharfrichter Aug 08 '19 at 09:11
  • Metastore(s) use a RDBMS as back-end, the connection params are either in `hive-metastore.xml` or _(legacy)_ in `hive-site.xml` used by Metastore service(s). You may try to query that one, but the model is not really documented, evolves, and has nuances for each RDBMS dialect supported. – Samson Scharfrichter Aug 08 '19 at 09:14
  • ...and in the future the back-end might be HBase and good luck for querying that one. – Samson Scharfrichter Aug 08 '19 at 09:15
  • Hi @SamsonScharfrichter, thanks for your comments. I edited with some notes... It make sense? PS: I not see Metastore as "Java API", because RDBMS connections are universal (any language and standard SQL access). – Peter Krauss Aug 08 '19 at 14:45
  • _(deep sigh)_ the Metastore **service** is a **Java application**; client/server communication uses Thrift message protocol. It has a **back-end** to persist information, which is currently a relational database, but might use HBase in the future. – Samson Scharfrichter Aug 08 '19 at 16:53
  • If you run a toy Hadoop cluster, with no proper Metastore service, then HiveServer2 (or Spark) will bootstrap an "embedded" Metastore service, with an "embedded" Derby database. But that's for test purposes only -- no real persistence, no sharing of table definitions between tools. – Samson Scharfrichter Aug 08 '19 at 16:56
  • Have a look at https://stackoverflow.com/questions/33880050/get-table-properties-out-of-hive-using-java-api – Samson Scharfrichter Aug 08 '19 at 16:58

1 Answers1

0

Spark-shell solution

Spark access Metastore under the hood, it have a first class metadata method, that returns a dataframe, the schema property, that expose names, types, etc. and offers getComment method.

See https://stackoverflow.com/a/57857021/287948

Peter Krauss
  • 13,174
  • 24
  • 167
  • 304