0

I have 2 datasets used for billing purposes and I've been asked to provide a report that summarises both. I'm having an issue where the system shows some of the billing in both places, as one side bills on Part No and Sales Order, and the other on Project and Customer, and I'm struggling to remove the duplicate.

The table post union is in the following format

REF CUSTOMER NETT_PRICE DATE_CREATED PART_NO SALESORDER PROJECT
AC ABC 100 20/3/23 123 7001 PR01
AC ABC 150 20/3/23 123 7001 PR01
AC ABC 170 20/3/23 123 7002 PR02
FP ABC 100 20/3/23 - - PR01
FP ABC 150 20/3/23 - - PR01
FP ABC 200 20/3/23 - - PR02

In the case of lines 1&4, and 2&5, they are the same billing as the NETT_PRICE, PROJECT and DATE_CREATED are the same, where REF dictates which area the billing came from. Where it's duplicated it has to be the AC reference that remains, and the order isn't always AC first.

My attempts at using ROW_NUMBER with OVER have not been successful, as shown below. Any guidance on this would be appreciated, or if ROW_NUMBER is not the way to go.

SELECT [REF]
      ,[NETT_PRICE]
      ,[QTY_ENTERED]
      ,[CREATED]
      ,[CUS]
      ,[SALESORDER]
      ,[PART_ONLY]
      ,ROW_NUMBER() over (partition by [NETT_PRICE], [QTY_ENTERED], [CREATED], [CUS], [PART_ONLY]
                            order by [CREATED], [OUR], [REF]
                        )
FROM [UDEF_SALESAREA_SUMMARY_TBL] WHERE ROW_NUMBER() > 1 

From the above table, the expected output should be:

REF CUSTOMER NETT_PRICE DATE_CREATED PART_NO SALESORDER PROJECT
AC ABC 100 20/3/23 123 7001 PR01
AC ABC 150 20/3/23 123 7001 PR01
AC ABC 170 20/3/23 123 7002 PR02
FP ABC 200 20/3/23 - - PR02

Where the FP duplicates on NETT_PRICE, PROJECT and DATE_CREATED are removed.

Saiyanthou
  • 142
  • 8

1 Answers1

1

How about something like this:

select *
from (
    SELECT  *
    , row_number() OVER(partition BY customer, NETT_PRICE, date_created ORDER BY CASE WHEN ref = 'ac' THEN 0 ELSE 1 END) AS sort
    FROM    (
        VALUES  (N'AC', N'ABC', 100, N'20/3/23', 123, 7001, N'PR01')
        ,   (N'AC', N'ABC', 150, N'20/3/23', 123, 7001, N'PR01')
        ,   (N'AC', N'ABC', 170, N'20/3/23', 123, 7002, N'PR02')
        ,   (N'FP', N'ABC', 100, N'20/3/23', NULL, NULL, N'PR01')
        ,   (N'FP', N'ABC', 150, N'20/3/23', NULL, NULL, N'PR01')
        ,   (N'FP', N'ABC', 200, N'20/3/23', NULL,NULL, N'PR02')
    ) t (REF,CUSTOMER,NETT_PRICE,DATE_CREATED,PART_NO,SALESORDER,PROJECT)
) x
where x.sort = 1

This "groups" every sales by customer, price and date_created and prioritizes AC ref by doing orderby. You can add more tie-breakers to order by in case there are more than one REFs.

siggemannen
  • 3,884
  • 2
  • 6
  • 24
  • This looks to do the trick, thank you. I have limited access to the actual data this morning, but seems to look okay on the test data I have. – Saiyanthou Mar 31 '23 at 07:19