1

I am trying to write a query In hive and I am seeing the following error. "Error while compiling statement:

FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: Primitve type DATE not supported in Value Boundary expression.

I used the same query in Oracle sql and it works fine. How can I write a valid order by in Hive?

Select   
Email,
FIRST_VALUE(C.abc_cust_id) Over (Partition By Lower(email) Order By C.regt_date
 Desc)As CUSTOMER_ID
from table X
leftjoin
  • 36,950
  • 8
  • 57
  • 116
Maddy
  • 33
  • 1
  • 6

1 Answers1

2

Because some primitive types support (it was no DATE type before) was added after windowing and windowing was not fixed. See HIVE-13973

As a workaround, try to cast DATE as STRING:

Over (Partition By Lower(email) Order By cast(C.regt_date as string) Desc)
leftjoin
  • 36,950
  • 8
  • 57
  • 116
  • Thank you so much! I really appreciate it. I have edited my query to use cast. However, the query runs really slow. Any suggestions as to how can i optimize the cast function. Thanks again – Maddy Apr 23 '21 at 19:08
  • @Maddy I believe it is not CAST which is working slow. It is FIRST_VALUE. 1. Check skew on email. Do you have too many rows with the same emails? 2. Try to increase parallelism on reducer https://stackoverflow.com/a/48487306/2700344 – leftjoin Apr 23 '21 at 19:34