0

I'm looking to set a variable like below, called today_date, and then be able to reuse it as a variable throughout the query. The below throws an error.

set today_date = date_format(date_sub(current_date, 1), 'YYYYMMdd')

select account
from table
where data_date = today_date
Matt W.
  • 3,692
  • 2
  • 23
  • 46

2 Answers2

0

First command should end with semicolon:

set today_date=date_format(date_sub(current_date, 1), 'YYYYMMdd');

And variable should be used like this:

select account
from table
where data_date=${hivevar:today_date};

set command will not calculate expression and it will be substituted as is. The resulted query will be

select account
from table
where data_date = date_format(date_sub(current_date, 1), 'YYYYMMdd');

If you want variable to be already calculated, then calculate it in a shell and pass to your Hive script like in this answer: https://stackoverflow.com/a/37821218/2700344

leftjoin
  • 36,950
  • 8
  • 57
  • 116
0

You still need to put a semicolon at the end of the set line, surround your variable with ${} and use the proper namespace.

Note that this will not execute the date_format() function when the variable is defined. When you use the variable the SQL code will just be copied as-is. Think of it as more as a macro than as a variable.

Furthermore, Hive has multiple variable namespaces. The 2 easiest options are either to be less verbose when you define your variable but to be more verbose when you use it (hiveconf namespace):

set today_date = date_format(date_sub(current_date, 1), 'YYYYMMdd');
select account from table where data_date = ${hiveconf:today_date};

or the other way round (hivevar namespace)

set hivevar:today_date = date_format(date_sub(current_date, 1), 'YYYYMMdd');
select account from table where data_date = ${today_date};
Guillaume
  • 2,325
  • 2
  • 22
  • 40