Current Status
We have a query that runs for 2+ hours. On examining the progress, the query spends a lot of time during the join with table T5 and during the final stage of the query. Is there any way we can simplify of do something about this query? I was unable to use aggregate functions in place of rank() as the orderby used is bit complicated.
What we have already tried
We have already converted the sub-queries to case statements in the select clause and helped reduce the execution time but that was not significant. We have simplified co-related query for T3, T4 and T6.
SELECT * FROM
(SELECT T2.f1, T2.f2 .... T5.f19, T5.f20,
case when T1.trxn_id is null then T2.crt_ts
when T1.trxn_id is not null and T5.acct_trxn_id is not null and T2.crt_ts >= T5.crt_ts then T2.crt_ts
when T1.trxn_id is not null and T5.acct_trxn_id is not null and T2.crt_ts < T5.crt_ts then T5.crt_ts
end as crt_ts ,
row_number() over ( partition by T2.w_trxn_id,
if(T1.trxn_id is null, 'NULL', T1.trxn_id)
order by T2.business_effective_ts desc,
case when T1.trxn_id is null then T2.crt_ts
when T1.trxn_id is not null and T5.acct_trxn_id is not null and T2.crt_ts >= T5.crt_ts then T2.crt_ts
when T1.trxn_id is not null and T5.acct_trxn_id is not null and T2.crt_ts < T5.crt_ts then T5.crt_ts
when T1.trxn_id is not null and T5.acct_trxn_id is null then T2.crt_ts end desc
) as rnk
FROM(SELECT * FROM T3 WHERE title_name = 'CAPTURE' and tr_dt IN (SELECT tr_dt FROM DT_LKP))
T2
LEFT JOIN (SELECT * FROM T6 WHERE tr_dt IN (SELECT tr_dt FROM DT_LKP))
T1 ON T2.w_trxn_id = T1.w_trxn_id AND T2.business_effective_ts = T1.business_effective_ts
LEFT JOIN (SELECT f1, f3. ... f20 FROM T4 WHERE tr_dt IN (SELECT tr_dt FROM DT_LKP))
T5 ON T1.trxn_id = T5.acct_trxn_id
WHERE if(T1.trxn_id is null, 'NULL', T1.trxn_id) = if(T5.acct_trxn_id is null, 'NULL', T5.acct_trxn_id)
) FNL WHERE rnk = 1