0

How can I set up this left join so that it does not duplicate my calls in Table2 or my sales in Table3?

Left joining to either Table2 or Table3 returns accurate answers, left joining to both in the same query duplicates - while I have three calls and sales of $444, it returns 6 calls (the three records of Table2 x the two records of Table3) and Sales of $1332.

Using SQL Server Studio 17.

CREATE TABLE #Table1 (
    CommercialTime datetime,
    NextCommercialTime datetime,
    Station varchar(4),
    Cost int    )
INSERT INTO #Table1 (
    CommercialTime,
    NextCommercialTime,
    Station,
    Cost    )
VALUES
(   '2020-04-06 12:01:00.000',
    '2020-04-06 12:15:00.000',
    'ZZZZ',
    '9999'  )

CREATE TABLE #Table2 (
    CallTime datetime,
    Station varchar(4),
    CallCount int   )
INSERT INTO #Table2 (
    CallTime,
    Station,
    CallCount   )
VALUES
(   '2020-04-06 12:02:00.000',
    'ZZZZ',
    '1' ),
(   '2020-04-06 12:05:00.000',
    'ZZZZ',
    '1' ),
(   '2020-04-06 12:07:00.000',
    'ZZZZ',
    '1' )

CREATE TABLE #Table3 (
    SaleTime datetime,
    Station varchar(4),
    SaleAmount int  )
INSERT INTO #Table3 (
    SaleTime,
    Station,
    SaleAmount  )
VALUES
(   '2020-04-06 12:04:00.000',
    'ZZZZ',
    '123'   ),
(   '2020-04-06 12:07:00.000',
    'ZZZZ',
    '321'   )

SELECT 
          one.Station
        , SUM(two.Callcount) as Calls
        , SUM(three.SaleAmount) as Sales
    FROM #Table1 one WITH(NOLOCK)
    LEFT JOIN #Table2 two   WITH(NOLOCK)    ON one.Station = two.Station 
                                                AND two.CallTime between one.CommercialTime and one.NextCommercialTime      
    LEFT JOIN #Table3 three WITH(NOLOCK)    ON one.Station = three.Station 
                                                AND three.SaleTime between one.CommercialTime and one.NextCommercialTime    
    GROUP BY one.Station

--Output:
    Station Calls   Sales
    ZZZZ    6   1332
DonnaC
  • 17
  • 1
  • 1
    This is a common error where people want some joins, each possibly involving a different key, of some subqueries, each possibly involving join and/or aggregation, but they erroneously try to do all the joining then all the aggregating or to aggregate over previous aggregations. Write separate sums over appropriate rows and/or sum a case statement picking rows; join on common unique column sets. PS In a [mre]: Find the first subexpression that is code that you can show does what you expect extended by code that doesn't do what you expect. Then explain what you expect & why. – philipxy Apr 22 '20 at 03:54
  • This is a faq. Before considering posting please read the manual & google any error message or many clear, concise & precise phrasings of your question/problem/goal, with & without your particular strings/names & site:stackoverflow.com & tags; read many answers. If you post a question, use one phrasing as title. Reflect your research. See [ask] & the voting arrow mouseover texts. We cannot reason, communicate or search unless we make the effort to (re-re-re-)write clearly. – philipxy Apr 22 '20 at 03:59
  • Does this answer your question? [Strange duplicate behavior from GROUP\_CONCAT of two LEFT JOINs of GROUP\_BYs](https://stackoverflow.com/questions/45250646/strange-duplicate-behavior-from-group-concat-of-two-left-joins-of-group-bys) – philipxy Apr 22 '20 at 03:59

2 Answers2

1

You can use outer apply :

SELECT one.Station, ISNULL(t2.calls, 0) AS Calls, ISNULL(t3.sales, 0) AS Sales
FROM #Table1 one WITH(NOLOCK) OUTER APPLY
     ( SELECT SUM(two.Callcount) as calls
       FROM #Table2 two
       WHERE one.Station = two.Station AND 
             two.CallTime between one.CommercialTime and one.NextCommercialTime      
     ) t2 OUTER APPLY
     ( SELECT SUM(three.SaleAmount) as sales
        FROM #Table3 three
        WHERE one.Station = three.Station AND 
              three.SaleTime between one.CommercialTime and one.NextCommercialTime    
     ) t3;
Yogesh Sharma
  • 49,870
  • 5
  • 26
  • 52
  • Do you consider that *better* or just *different* to using a sub-query? – Dale K Apr 22 '20 at 04:57
  • 1
    @DaleK. . . In this case, this would be equivalent to using subqueries if it used `OUTER APPLY` (as it is, it filters out rows if there is no match in either table). However, lateral joins are much more powerful than subqueries. And some people prefer to keep table references in the `FROM` clause rather than scattered throughout the query. In the context of this question, the subqueries are more accurate, because the question explicitly mentioned `LEFT JOIN`s. – Gordon Linoff Apr 22 '20 at 12:19
0

For this problem I suggest using sub-queries rather than joins e.g.

SELECT 
  one.Station
  , (
    select sum(CallCount)
    from #Table2 two
    where one.Station = two.Station 
    AND two.CallTime between one.CommercialTime and one.NextCommercialTime
  ) Calls
  , (
    select sum(SaleAmount)
    from #Table3 three
    where one.Station = three.Station 
    AND three.SaleTime between one.CommercialTime and one.NextCommercialTime
  ) Sales
FROM #Table1 one

The reason being is that a join gives you every combination of rows which isn't what you want. To explain, when you join Table2 onto Table1 on you get 3 rows:

Station TwoCallTime
ZZZZ    2020-04-06 12:02:00.000
ZZZZ    2020-04-06 12:05:00.000
ZZZZ    2020-04-06 12:07:00.000

So then when you join on Table3, which has 2 rows, you now have 6 rows:

Station TwoCallTime             ThreeCallTime
ZZZZ    2020-04-06 12:02:00.000 2020-04-06 12:04:00.000
ZZZZ    2020-04-06 12:02:00.000 2020-04-06 12:07:00.000
ZZZZ    2020-04-06 12:05:00.000 2020-04-06 12:04:00.000
ZZZZ    2020-04-06 12:05:00.000 2020-04-06 12:07:00.000
ZZZZ    2020-04-06 12:07:00.000 2020-04-06 12:04:00.000
ZZZZ    2020-04-06 12:07:00.000 2020-04-06 12:07:00.000

Which is not what you are trying to accomplish here.

Dale K
  • 25,246
  • 15
  • 42
  • 71