1

Background:

As a part of the Modernization effort, I am trying to convert a big Stored Procedure into HiveQL script. HiveQL script, as a part of hive activity, runs on Azure HDInsight cluster whenever a Pipeline is triggered from Azure Data Factory.

The stored procedure that I am trying to transform has a lot of variables declared using 'DECLARE' statements. For eg:

DECLARE @Variable1 INT;

Values in these variables are Set using SELECT statements. For eg:

SELECT  @Variable1 = ColumnName1 FROM Table_Name;

and these variables are referenced throughout the stored procedure like this:

SELECT * FROM Some_Table where ColumeName < @Variable1

and many complex scenarios where subquerying is not possible.

How can I do the same in HiveQL? Is there any way to declare, modify and use variables in HiveQL script?

leftjoin
  • 36,950
  • 8
  • 57
  • 116
chirag_lad
  • 229
  • 1
  • 16

1 Answers1

2

HiveQL is not procedure language unfortunately.

You cannot use variables like this.

Variables in HiveQL are just text replacement, they are not calculated and being substituted as is literally.

Use some shell script or python, etc or try HPL/SQL

Related answer: https://stackoverflow.com/a/37821218/2700344 and this one: https://stackoverflow.com/a/51492264/2700344 - read for better understanding how variable substitution works in hiveql

leftjoin
  • 36,950
  • 8
  • 57
  • 116