2

I want to store current_day - 1 in a variable in Hive. I know there are already previous threads on this topic but the solutions provided there first recommends defining the variable outside hive in a shell environment and then using that variable inside Hive.

Storing result of query in hive variable

I first got the current_Date - 1 using

 select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1);

Then i tried two approaches:

1. set date1 = ( select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1);

and 

2. set hivevar:date1 = ( select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1);

Both the approaches are throwing an error:

"ParseException line 1:82 cannot recognize input near 'select' 'date_sub' '(' in expression specification"

When I printed (1) in place of yesterday's date the select query is saved in the variable. The (2) approach throws "{hivevar:dt_chk} is undefined ".

I am new to Hive, would appreciate any help. Thanks.

Raj
  • 1,049
  • 3
  • 16
  • 30

2 Answers2

4

Hive doesn't support a straightforward way to store query result to variables.You have to use the shell option along with hiveconf.

date1 = $(hive -e "set hive.cli.print.header=false; select date_sub(from_unixtime(unix_timestamp(),'yyyy-MM-dd'),1);")
hive -hiveconf "date1"="$date1" -f hive_script.hql

Then in your script you can reference the newly created varaible date1

select '${hiveconf:date1}'
nobody
  • 10,892
  • 8
  • 45
  • 63
0

After lots of research, this is probably the best way to achieve setting a variable as an output of an SQL:

INSERT OVERWRITE LOCAL DIRECTORY '<home path>/config/date1'
select CONCAT('set hivevar:date1=',date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1)) from <some table> limit 1;
source <home path>/config/date1/000000_0;

You will then be able to use ${date1} in your subsequent SQLs.

Here we had to use <some table> limit 1 as hive got a bug in insert overwrite if we don't specify a table name.

double-beep
  • 5,031
  • 17
  • 33
  • 41