0

I am using Teradata database but I am quite new to its functioning. Could you please help me with making the below query more efficient so that it does not yield 'no more spool space' error? It is getting too heavy after I add the 2nd join.

SELECT
  a.src_cmpgn_code,
  a.cmc_name,
  SUM(b.open_cnt)
FROM access_views.dw_cmc_lkp a
LEFT JOIN prs_restricted_v.mh_crm_engmnt_sd b
  ON b.cmpgn_id = a.cmc_id
LEFT JOIN access_views.dw_cmc_instnc c
  ON b.cmpgn_id = c.cmc_id
WHERE 1=1
  AND b.trigger_dt BETWEEN '2019-01-01' AND '2019-12-31'
  AND b.site_cntry_id = 1
  AND a.cmpgn_group_name IN ('a', 'b', 'c', 'd')
  AND c.dlvry_vhcl_id IN (1, 10)
  AND c.chnl_id = 1
GROUP BY 1,2;

Explain looks like this:

This query is optimized using type 2 profile Cost_NoSlidingJ_Profile, profileid 10007. 1) First, we lock mdm_tables.DW_CMC_INSTNC in view access_views.dw_cmc_instnc for access, we lock MDM_TABLES.DW_CMC_LKP in view access_views.dw_cmc_lkp for access, and we lock PRS_T.MH_CRM_ENGMNT_SD in view prs_restricted_v.mh_crm_engmnt_sd for access. 2) Next, we do an all-AMPs RETRIEVE step from 365 partitions of PRS_T.MH_CRM_ENGMNT_SD in view prs_restricted_v.mh_crm_engmnt_sd with a condition of ("(NOT (PRS_T.MH_CRM_ENGMNT_SD in view prs_restricted_v.mh_crm_engmnt_sd.CMPGN_ID IS NULL )) AND (((PRS_T.MH_CRM_ENGMNT_SD in view prs_restricted_v.mh_crm_engmnt_sd.TRIGGER_DT <= DATE '2019-12-31') AND (PRS_T.MH_CRM_ENGMNT_SD.TRIGGER_DT >= DATE '2019-01-01')) AND (PRS_T.MH_CRM_ENGMNT_SD in view prs_restricted_v.mh_crm_engmnt_sd.SITE_CNTRY_ID = 1. ))") into Spool 4 (all_amps), which is redistributed by the hash code of ( PRS_T.MH_CRM_ENGMNT_SD.CMPGN_ID) to all AMPs. The size of Spool 4 is estimated with no confidence to be 329,656,959 rows ( 7,582,110,057 bytes). The estimated time for this step is 2.40 seconds. 3) We do an all-AMPs JOIN step from MDM_TABLES.DW_CMC_LKP in view access_views.dw_cmc_lkp by way of an all-rows scan with a condition of ("MDM_TABLES.DW_CMC_LKP in view access_views.dw_cmc_lkp.CMPGN_GROUP_NAME IN ('Bucks_Nectar_eBayPlus', 'DailyDeal','Other','STEP_User_Agreement')"), which is joined to Spool 4 (Last Use) by way of an all-rows scan. MDM_TABLES.DW_CMC_LKP and Spool 4 are joined using a single partition hash join, with a join condition of ("CMPGN_ID = MDM_TABLES.DW_CMC_LKP.CMC_ID"). The result goes into Spool 5 (all_amps) fanned out into 5 hash join partitions, which is built locally on the AMPs. The size of Spool 5 is estimated with no confidence to be 79,119,821 rows (10,681,175,835 bytes). The estimated time for this step is 0.19 seconds. 4) We do an all-AMPs RETRIEVE step from mdm_tables.DW_CMC_INSTNC in view access_views.dw_cmc_instnc by way of an all-rows scan with a condition of ("(mdm_tables.DW_CMC_INSTNC in view access_views.dw_cmc_instnc.DLVRY_VHCL_ID IN (1 , 10 )) AND ((mdm_tables.DW_CMC_INSTNC in view access_views.dw_cmc_instnc.CHNL_ID = 1) AND (mdm_tables.DW_CMC_INSTNC in view access_views.dw_cmc_instnc.TRTMNT_TYPE_CODE <> 'I'))") into Spool 6 (all_amps) fanned out into 5 hash join partitions, which is redistributed by the hash code of ( mdm_tables.DW_CMC_INSTNC.CMC_ID) to all AMPs. The size of Spool 6 is estimated with no confidence to be 2,874,675 rows (48,869,475 bytes). The estimated time for this step is 0.58 seconds. 5) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of an all-rows scan, which is joined to Spool 6 (Last Use) by way of an all-rows scan. Spool 5 and Spool 6 are joined using a hash join of 5 partitions, with a join condition of ("(CMPGN_ID = CMC_ID) AND (CMC_ID = CMC_ID)"). The result goes into Spool 3 (all_amps), which is built locally on the AMPs. The size of Spool 3 is estimated with no confidence to be 5,353,507,625 rows ( 690,602,483,625 bytes). The estimated time for this step is 14.82 seconds. 6) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by way of an all-rows scan , grouping by field1 ( MDM_TABLES.DW_CMC_LKP.SRC_CMPGN_CODE ,MDM_TABLES.DW_CMC_LKP.CMC_NAME). Aggregate Intermediate Results are computed globally, then placed in Spool 7. The size of Spool 7 is estimated with no confidence to be 11,774 rows (5,286,526 bytes). The estimated time for this step is 24.51 seconds. 7) We do an all-AMPs RETRIEVE step from Spool 7 (Last Use) by way of an all-rows scan into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 11,774 rows (2,837,534 bytes). The estimated time for this step is 0.01 seconds. 8) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 42.50 seconds.

Pawels
  • 192
  • 2
  • 12
  • 2
    The table b and c conditions in the WHERE clause make the LEFT JOIN's return regular INNER JOIN result. Move them to ON, if you want true LEFT JOIN result. – jarlh Feb 07 '20 at 10:59
  • Why _LEFT JOIN_ c, when you don't use any of its columns? – jarlh Feb 07 '20 at 11:01
  • What are the relations between those tables, which columns are unique, cmpgn_id, cmc_id? Are those views based on joins or a single table? Can you show Explain? – dnoeth Feb 07 '20 at 11:13
  • @dnoeth Added explain – Pawels Feb 07 '20 at 11:22
  • 2
    Optimizer changed Outer to Inner Joins (as expected). Seems there are no usable statistics (all steps show "no confidence"). Based on the estimated 5,353,507,625 rows the join to dw_cmc_instnc is many-many instead of 1-many. Are you sure this will return the correct result? – dnoeth Feb 07 '20 at 13:47
  • @dnoeth The row count is definitely nonsense. The first two tables joined yield a result at about 500 rows and I expected less after joining dw_cmc_instnc and introducing conditions based on it in the WHERE clause. – Pawels Feb 07 '20 at 14:18
  • @jarlh changed the joins, thank you – Pawels Feb 07 '20 at 14:27
  • 1
    Can you collect stats? With better estimates join planning might change. – dnoeth Feb 07 '20 at 15:08
  • @dnoeth I finally got through using volatile tables and inner joining them to avoid multiplication. Thanks for the ideas. – Pawels Feb 10 '20 at 15:08

0 Answers0