CREATE TABLE t1 (id INT);

I ran the query above on Hive 2.3.6 (MapR Hadoop Distribution 6.3.0).

The default Hive execution engine was Tez. After running the query, I could not see any Tez application launched in the YARN ResourceManager web UI.

So I changed the execution engine to MapReduce:

set hive.execution.engine=mr;

Then I ran the same query again. Again, I could not see any MR application launched in the YARN ResourceManager web UI.

So my questions are: how does Hive handle this type of query? And where are the details of such queries stored, such as the application ID, start time, and so on?

Pash0002

1 Answer

CREATE TABLE is a metadata-only operation; no data is processed. It creates records in the metastore database, so no distributed processing framework such as Tez or MR is needed, and YARN is not used.

Where possible, the compiler translates DDL into metastore operations only.
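One way to check this yourself is to ask Hive for the plan of the DDL statement. The following is a minimal sketch, assuming EXPLAIN is accepted for this statement on your Hive version; the table name t1 is taken from the question.

```sql
-- Show the plan without executing the statement.
-- For a metadata-only statement, the plan is expected to contain
-- only a DDL stage and no Map Reduce or Tez stage.
EXPLAIN CREATE TABLE t1 (id INT);
```

If the plan lists no Map Reduce or Tez stage, nothing is submitted to YARN, which would explain why no application shows up in the ResourceManager web UI.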

Some simple DQL queries can also be answered from the metastore alone, without Tez or MR, if statistics exist and this feature is enabled: https://stackoverflow.com/a/41021682/2700344
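As an illustration, the feature in question is controlled by the hive.compute.query.using.stats property. This is a sketch, assuming the table t1 from the question exists and that its statistics are accurate; whether a given query is actually answered from statistics depends on the Hive version and on the query shape.

```sql
-- Allow Hive to answer simple aggregates from metastore statistics.
SET hive.compute.query.using.stats=true;

-- Make sure basic statistics (such as row count) exist for the table.
-- Note: this statement may itself launch a job to scan the data.
ANALYZE TABLE t1 COMPUTE STATISTICS;

-- With up-to-date statistics, this count can be served from the metastore.
SELECT COUNT(*) FROM t1;
```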

Small tables can also be queried without a distributed framework, using a fetch-only task; see the question "Why is Fetch task in Hive works faster than Map-only task?"
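The fetch-only behaviour is governed by the hive.fetch.task.conversion settings. Below is a minimal sketch, assuming the table t1 from the question; the threshold value is only an example, not a recommendation.

```sql
-- Allow simple SELECT, filter and LIMIT queries to run as a fetch task.
-- Accepted values are none, minimal and more.
SET hive.fetch.task.conversion=more;

-- Maximum input size in bytes for which the conversion is applied
-- (example value: 1 GB).
SET hive.fetch.task.conversion.threshold=1073741824;

-- Expected to be served by a fetch task, with no YARN application.
SELECT * FROM t1 LIMIT 10;

-- The plan should show a Fetch Operator stage rather than a
-- Map Reduce or Tez stage if the conversion was applied.
EXPLAIN SELECT * FROM t1 LIMIT 10;
```

Setting hive.fetch.task.conversion=none turns the conversion off, which can be useful when you want every query to appear as an application in YARN.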

leftjoin
  • Thanks. Can we extract those metastore-only queries via any API or from an HDFS location? – Pash0002 May 07 '21 at 06:47
  • @Pash0002 Please explain what you mean. There are some APIs: https://cwiki.apache.org/confluence/display/Hive/Hive+APIs+Overview Also, depending on the distribution or vendor, you may have additional APIs, like those provided by Qubole, AWS and others – leftjoin May 07 '21 at 07:45
  • For MapReduce and Tez we have the job history server and the timeline server REST APIs, respectively. Is there any similar API for fetch-task queries? – Pash0002 May 11 '21 at 06:23
  • @Pash0002 I'm not sure whether it is possible to monitor a fetch-only task; it is just a `cat` command, with no map-reduce involved. But Cloudera says that the fetch-only task's log is available on HiveServer: https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/managing-hive/topics/hive_query_progress.html – leftjoin May 11 '21 at 07:54