0

I have a query which provides the max_processed_date (it provides a single value) and this date is used to fetch data from master_table.I am using the hivevars to do so, but it is directly substituting the query instead of first evaluating it.

set hivevar max_date= select max_processed_date from Tab_date where process='X';

select * from master_table where process_date > ${hivevar:max_date}

but this performs as select * from master_table where process_date > select max_processed_date from Tab_date where process='X';

which is not supported by hiveql. Is it possible that the hivevar can be evaluated first and this value can be used in the master_table query ?

Dhiraj Tayade
  • 407
  • 3
  • 10
  • 22

1 Answers1

0

The answer mentioned above (HiveQL: Using query results as variables) could be used to put the result of a query into a variable using a Shell script wrapper. But for your particular problem, you could do something like this:

select a.* 
from master_table as a,
  (select max_processed_date from Tab_date where process='X') as b
where a.process_date > b.max_processed_date;

Kindly note that the inner query should be 1 row, or else this will blow up the final table. A distinct in the select statement could resolve it, if it is necessary for the code logic.

Sayon M
  • 146
  • 1
  • 10