1

I've written a SQL query which produces a table as given below.

The results needs to be grouped into a single line for each business. I've tried a lot of ways but I'm not able to do that. I've tried to insert the final table data into a temporary table and then perform grouping on the temp table but that does not produce the correct output.

;WITH Generate_Dates(Start_Date, End_Date) AS (
  SELECT
    CAST(StartDateUtc AS Date),
    CAST(
      CASE WHEN [bg].EndDateUtc IS NULL THEN GETDATE() ELSE CAST([bg].EndDateUtc AS Date) end AS Date
    )
  FROM
    BusinessGoal AS [bg] WITH (NOLOCK)
  WHERE
    [bg].Period = 'Year'
  UNION ALL
  SELECT
    DATEADD(yy, 1, Start_Date),
    End_Date
  FROM
    Generate_Dates
  WHERE
    DATEADD(yy, 1, Start_Date) < End_Date
)
SELECT
  (
    Generate_Dates.Start_Date
  ) AS "Start Period Date",
  (
    Generate_Dates.End_Date
  ) AS "End Period Date",
  [bg].BusinessRefId AS 'Brand',
  (
    CAST (
      [bg].Amount AS DECIMAL
    )
  ) AS "Committed Transaction Fee",
  (
    SELECT
      CAST (
        sum (
          (
            CASE WHEN [s].Credit = 0 THEN - [s].SettlementAmount ELSE [s].SettlementAmount END
          )
        ) AS DECIMAL
      )
  ) AS "Actual Transaction Fee",
  (
    SELECT
      TOP 1 (
        CASE [s].Culture WHEN 'en-US' THEN 'USD' WHEN 'en-CA' THEN 'CAD' WHEN 'en-IE' THEN 'EUR' WHEN 'en-AU' THEN 'AUD' WHEN 'en-NZ' THEN 'NZD' WHEN 'en-Gbg' THEN 'GbgP' WHEN 'es-MX' THEN 'MXN' WHEN 'jp-JP' THEN 'JPY' WHEN 'nbg-NO' THEN 'NOK' end
      )
    FROM
      Settlement AS [s] WITH (NOLOCK)
  ) AS "Currency"
FROM
  BusinessGoal AS [bg] WITH (NOLOCK)
  INNER JOIN Settlement AS [s] WITH (NOLOCK) ON [bg].BusinessRefId = [s].CompanyRefId
  INNER JOIN OrderProductVariant AS [opv] WITH (NOLOCK) ON
  (
    [opv].MRefId = [s].CompanyRefId
    AND [opv].Id = [s].OrderProductVariantId
  )
  INNER JOIN [Order] AS [o] WITH (NOLOCK) ON [o].Id = [opv].OrderId
  INNER JOIN Generate_Dates ON GoalType = 'CommittedTransactionFee'
WHERE
    StartDateUtc <= [s].CreatedOnUtc
    AND (
      (
        EndDateUtc >= [s].CreatedOnUtc
      )
      OR (EndDateUtc is null)
    )
GROUP BY
  Start_Date,
  End_Date,
  BusinessRefId,
  Amount

OUTPUT:

Start Period Date End Period Date brand   Committed Transaction Fee 
2019-06-07        2019-08-26  NIKE        45000
2019-06-07        2019-08-26  ADIDAS      18000 
2019-06-07        2019-08-26  ADIDAS      18000 
2019-06-14        2019-08-26  NIKE        45000
2019-06-14        2019-08-26  ADIDAS      18000
2019-06-14        2019-08-26  ADIDAS      18000 
2019-07-25        2019-08-26  NIKE        45000
2019-07-25        2019-08-26  ADIDAS      18000
2019-07-25        2019-08-26  ADIDAS      18000
  • 2
    MySQL or SQL Server? – mkRabbani Aug 26 '19 at 07:11
  • 3
    MySQL <> SQL Server. Tag appropriate RDBMS only, when posting a question, instead of spamming with different tags. – Madhur Bhaiya Aug 26 '19 at 07:11
  • 2
    Make it easy to assist you: [mcve] – jarlh Aug 26 '19 at 07:12
  • @mkRabbani SQL Server – Shantanu Dwivedi Aug 26 '19 at 07:13
  • 1
    Please add sample data and your expected output from them. This will help. – mkRabbani Aug 26 '19 at 07:16
  • 3
    If you want one output row per business, you need to group by `BusinessRefId`. Then you need to decide how you want the other columns to be summarised. Should they be averaged, summed etc? – Daniel Earwicker Aug 26 '19 at 07:17
  • @DanielEarwicker I'm grouping it by BusinessRefId bit it is giving me the same output. – Shantanu Dwivedi Aug 26 '19 at 07:20
  • 2
    According to your posted query, you are grouping it by five columns! You need to group it by the one single column you want to group it by. – Daniel Earwicker Aug 26 '19 at 07:21
  • 2
    Grouping by 5 columns will create an output row for every distinct (unique) combination of values in those columns. – Daniel Earwicker Aug 26 '19 at 07:22
  • 2
    Possible duplicate of [How does GROUP BY work?](https://stackoverflow.com/questions/7434657/how-does-group-by-work) – Daniel Earwicker Aug 26 '19 at 07:25
  • 1
    Some side notes: **(1)** Don't place `DISTINCT` in a query where it makes no sense. Always consider why exactly you think you must apply `DISTINCT`. `DISTINCT` is a typical indicator for a flawed query. **(2)** The delimiter for aliases is `"` in standard SQL and `[]` in your non-compliant DBMS. It's not the single quote `'`. **(3)** `SUM` gets you a single value. How can you select the `TOP 1` of a single value? That makes no sense. **(4)** Why all those parentheses? **(5)** Why all those `WITH (NOLOCK)`. Are you in a desperate need to improve performance at the risk of dirty reads? – Thorsten Kettner Aug 26 '19 at 07:41
  • 1
    I think you are asking the wrong question. Why dont you post some sample data, and expected outcome. Most probably there will be another and better way to accomplish what you are after – GuidoG Aug 26 '19 at 07:49

1 Answers1

2

It depends on the output you want to produce.

If you want one row per business, but also want to return the additional columns (Start Period Date, End Period Date, brand, Committed Transaction Fee), you probably won't be able to use a GROUP BY since you'll need to include those columns in your GROUP BY clause.

You can use a QUALIFY function to do this and still return the columns, but you have to decide how to "pick" each row to be returned. For example:

QUALIFY ROW_NUMBER() OVER(
  PARTITION BY [bg].BusinessRefId 
  ORDER BY Generate_Dates.Start_Date DESC
) = 1

This does the following:

  1. Split the result set rows into groups based on [bg].BusinessRefId
  2. Order the rows within those groups based on Generate_Dates.Start_Date
  3. Return the first row of each group

You can replace your GROUP BY clause with the QUALIFY.

WITH Generate_Dates(Start_Date, End_Date) AS (
  SELECT
    CAST(StartDateUtc AS Date),
    CAST(
      CASE 
        WHEN [bg].EndDateUtc IS NULL THEN GETDATE() 
        ELSE CAST([bg].EndDateUtc AS Date) 
      END AS DATE
    )
  FROM BusinessGoal AS [bg] WITH (NOLOCK)
  WHERE [bg].Period = 'Year'

  UNION ALL

  SELECT
    DATEADD(yy, 1, Start_Date),
    End_Date
  FROM Generate_Dates
  WHERE DATEADD(yy, 1, Start_Date) < End_Date
)
SELECT DISTINCT
  Generate_Dates.Start_Date AS 'Start Period Date',
  Generate_Dates.End_Date AS 'End Period Date',
  [bg].BusinessRefId AS 'brand',
  CAST ([bg].Amount AS DECIMAL) AS 'Committed Transaction Fee',
  (
    SELECT
      TOP 1 CAST (
        SUM (
            CASE 
              WHEN [s].Credit = 0 THEN - [s].SettlementAmount 
              ELSE [s].SettlementAmount 
            END
        ) AS DECIMAL
      )
  ) AS 'Actual Transaction Fee',
  (
    SELECT
      TOP 1 (
        CASE [s].Culture 
          WHEN 'en-US' THEN 'USD' 
          WHEN 'en-CA' THEN 'CAD' 
          WHEN 'en-IE' THEN 'EUR' 
          WHEN 'en-AU' THEN 'AUD' 
          WHEN 'en-NZ' THEN 'NZD' 
          WHEN 'en-Gbg' THEN 'GbgP' 
          WHEN 'es-MX' THEN 'MXN' 
          WHEN 'jp-JP' THEN 'JPY' 
          WHEN 'nbg-NO' THEN 'NOK'
        END
      )
    FROM Settlement AS [s] WITH (NOLOCK)
  ) AS 'Currency'
FROM BusinessGoal AS [bg] WITH (NOLOCK)
INNER JOIN Settlement AS [s] WITH (NOLOCK) ON [bg].BusinessRefId = [s].CompanyRefId
INNER JOIN OrderProductVariant AS [opv] WITH (NOLOCK) ON
  (
    [opv].MRefId = [s].CompanyRefId
    AND [opv].Id = [s].OrderProductVariantId
  )
INNER JOIN [Order] AS [o] WITH (NOLOCK) ON [o].Id = [opv].OrderId
INNER JOIN Generate_Dates ON GoalType = 'CommittedTransactionFee'
WHERE StartDateUtc <= [s].CreatedOnUtc
AND (EndDateUtc >= [s].CreatedOnUtc OR EndDateUtc IS NULL)
QUALIFY ROW_NUMBER() OVER(
  PARTITION BY [bg].BusinessRefId 
  ORDER BY Generate_Dates.Start_Date DESC
) = 1
--GROUP BY Start_Date, End_Date, BusinessRefId, Amount, [s].Culture

Also, like the previous poster mentioned, you don't need the DISTINCT in there. If you're doing a GROUP BY without any aggregate functions, it's logically equivalent to doing a DISTINCT.

I haven't tested it, so not sure if it works...but something to try.

ravioli
  • 3,749
  • 3
  • 14
  • 28