0

I have a view named b_balance which returns the following records:

SECURITIES_CODE   BUY_SELL_FLAG   C_BALANCE   P_BALANCE
---------------   -------------   ---------   ---------
10042             BUY                   200           0 
10042             BUY                   500           0 
10042             SELL                  200           0 
10042             BUY                     0        5000 
10042             SELL                    0        2500 
10043             BUY                   300           0 
10043             SELL                    0        2500

and another view named as t_balance which returns the following records:

SECURITIES_CODE   BUY_SELL_FLAG   C_BALANCE   P_BALANCE
---------------   -------------   ---------   ---------
10042             BUY                     0        5000 
10043             BUY                   300           0 
10042             SELL                  200           0 
10042             SELL                    0        2500 
10043             SELL                    0        2500 
10042             BUY                   200           0 
10042             BUY                   500           0 

Now the problem occurs, when I execute my SQL

SELECT TO_CHAR(to_date('20170801','yyyyMMdd'), 'MM/dd/yyyy') AS TRADE_DATE,
  b.securities_code                                         AS SECURITIES_CODE,
  b.buy_sell_flag                                           AS SIDE,
  SUM(NVL(t.c_balance,0))                                  AS C_t_balance,
  SUM(NVL(b.c_balance,0))                                   AS C_b_balance,
  SUM(NVL(t.c_balance,0)) - SUM(NVL(b.c_balance,0))       AS C_DIFFERENCE,
  SUM(NVL(t.p_balance,0))                                  AS P_t_balance,
  SUM(NVL(b.p_balance,0))                                   AS P_b_balance,
  SUM(NVL(t.p_balance,0)) - SUM(NVL(b.p_balance,0))       AS P_DIFFERENCE
FROM b_balance b
FULL OUTER JOIN t_balance t
ON b.securities_code = t.securities_code
AND b.buy_sell_flag  = t.buy_sell_flag
GROUP BY b.securities_code,
  b.buy_sell_flag
ORDER BY SECURITIES_CODE,
  SIDE ;

this returns the following records:

TRADE_DATE   SECURITIES_CODE   SIDE   C_T_BALANCE   C_B_BALANCE   C_DIFFERENCE   P_T_BALANCE   P_B_BALANCE    P_DIFFERENCE
----------   ---------------   ----   -----------   -----------   ------------   -----------   ------------   ------------
08/01/2017   10042             BUY           2100          2100              0         15000         15000               0 
08/01/2017   10042             SELL           400           400              0          5000          5000               0 
08/01/2017   10043             BUY            300           300              0             0             0               0 
08/01/2017   10043             SELL             0             0              0          2500          2500               0 

that means the result is being multiplied by number of rows. I checked on Stack overflow and did't find anything wrong according to this answer.

So what is the wrong in my SQL?

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
Soumyadip Das
  • 1,781
  • 3
  • 16
  • 34
  • According to the join predicates used, the relationship between the views is `M:M` so the balance is getting duplicated before summing up.. – Pரதீப் Jul 19 '17 at 05:40
  • group by column are from b_balance table and with grouping BY b.securities_code, b.buy_sell_flag result is correct, you look for this: 10043 BUY C_T_BALANCE -> 600 ? then use union (instead of join) before grouping – hvojdani Jul 19 '17 at 05:46
  • 1
    and according this link, use COALESCE instead of NVL is better for performance. https://stackoverflow.com/questions/950084/ – hvojdani Jul 19 '17 at 05:54
  • To debug the query remove the `GROUP BY` clause and the `SUM()` functions (but keep the arguments of the `SUM()` expressions). What you get is the raw data used to generate the groups. The `JOIN` produces too many rows, probably because the `ON` conditions are too loose. Tweak the conditions (`ON`, `WHERE`) until the `JOIN` produces only the rows you need then put back the `GROUP BY` and the `SUM()` calls. – axiac Jul 19 '17 at 06:57

2 Answers2

1

you should join the aggregated result (not aggregate the joined values)

SELECT TO_CHAR(to_date('20170801','yyyyMMdd'), 'MM/dd/yyyy') AS TRADE_DATE,
  t1.SECURITIES_CODE,
  t1.SIDE,
  t1.C_b_balance,
  t1.P_b_balance,
  t2.C_t_balance,
  t2.P_t_balance,
from (
    SELECT
      b.securities_code            AS SECURITIES_CODE,
      b.buy_sell_flag              AS SIDE,
      SUM(NVL(b.c_balance,0))      AS C_b_balance,
      SUM(NVL(b.p_balance,0))      AS P_b_balance,
    FROM b_balance b
    GROUP BY b.securities_code, b.buy_sell_flag ) t1 
left join (
    SELECT 
      t.securities_code            AS SECURITIES_CODE,
      t.buy_sell_flag              AS SIDE,
      SUM(NVL(t.c_balance,0))      AS C_t_balance,
      SSUM(NVL(t.p_balance,0))     AS P_t_balance,
    FROM t_balance t
    GROUP BY t.securities_code, t.buy_sell_flag
) on t1.securities_code = t2.securities_code and t1.buy_sell_flag  = t2.buy_sell_flag
ScaisEdge
  • 131,976
  • 10
  • 91
  • 107
  • 1
    Good answer. But it seems that data can be missing in either table, so OP chose `full outer join`. I see no reason to change this to a mere `left outer join`. Some more points: there is a typo `SSUM`. You have two commas too many (`AS P_b_balance,` and `AS P_t_balance,`). You are missing the t2 qualifier for the second derived table. `NVL` belongs in the main select clause; it is unlikely to get a null sum in the subqueries, but likely to get a null, because of outer joining. – Thorsten Kettner Jul 19 '17 at 06:56
  • There is no need to convert a string literal to date; you can use a date literal instead: `to_char(date '2017-008-01', 'mm/dd/yyyy')`. You are using good table aliases `b` and `t` when you don't need them. In your main query you are using non-mnemonic aliases `t1` and `t2` then. It would be better to use the better names `b` and `t` here. – Thorsten Kettner Jul 19 '17 at 06:58
0

When working with aggregates from different tables, aggregate before joining:

select
  date '2017-08-01'                                 as trade_date,
  securities_code                                   as securities_code,
  buy_sell_flag                                     as side,
  nvl(t.sum_c_balance, 0)                           as c_t_balance,
  nvl(b.sum_c_balance, 0)                           as c_b_balance,
  nvl(t.sum_c_balance, 0) - nvl(b.sum_c_balance, 0) as c_difference,
  nvl(t.sum_p_balance, 0)                           as p_t_balance,
  nvl(b.sum_p_balance, 0)                           as p_b_balance,
  nvl(t.sum_p_balance, 0) - nvl(b.sum_c_balance, 0) as p_difference  
from
(
  select
    securities_code,
    buy_sell_flag,
    sum(c_balance) as sum_c_balance,
    sum(p_balance) as sum_p_balance
  from b_balance
  group by securities_code, buy_sell_flag
) b
full outer join
(
  select
    securities_code,
    buy_sell_flag,
    sum(c_balance) as sum_c_balance,
    sum(p_balance) as sum_p_balance
  from t_balance
  group by securities_code, buy_sell_flag
) t using (securities_code, buy_sell_flag)
order by securities_code, buy_sell_flag;
Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73