I have a Hive query like the one below:
select a.x as column from table1 a where a.y in (<long comma-separated list of parameters>)
union all
select b.x as column from table2 b where b.y in (<long comma-separated list of parameters>)
I have set hive.exec.parallel to true, which helps me achieve parallelism between the two queries joined by union all.
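For reference, this is roughly how the setting is applied in the session before running the query (a minimal sketch; it could equally be set in the site configuration):

```sql
-- Allow independent stages of a query (such as the two UNION ALL branches) to run concurrently
SET hive.exec.parallel=true;
```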
However, my IN clause has many comma-separated values, and each value seems to be processed in its own job, one after the other. In effect, the values are handled sequentially.
Is there a Hive parameter that, if enabled, would let me fetch the data in parallel for the values in the IN clause?
Currently, my workaround is to fire the select query with = multiple times, once per value, instead of using a single IN clause.
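A minimal sketch of that workaround for table1, assuming two placeholder values 'v1' and 'v2' (hypothetical; the real list is much longer):

```sql
-- One query per value, each using = instead of IN.
-- 'v1' and 'v2' are placeholders, not values from the real parameter list.
select a.x as column from table1 a where a.y = 'v1';
select a.x as column from table1 a where a.y = 'v2';
```

The same pattern is repeated for table2 with b.y.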