I have 2 datasets used for billing purposes and I've been asked to provide a report that summarises both. I'm having an issue where the system shows some of the billing in both places, as one side bills on Part No and Sales Order, and the other on Project and Customer, and I'm struggling to remove the duplicate.
The table post union is in the following format
REF | CUSTOMER | NETT_PRICE | DATE_CREATED | PART_NO | SALESORDER | PROJECT |
---|---|---|---|---|---|---|
AC | ABC | 100 | 20/3/23 | 123 | 7001 | PR01 |
AC | ABC | 150 | 20/3/23 | 123 | 7001 | PR01 |
AC | ABC | 170 | 20/3/23 | 123 | 7002 | PR02 |
FP | ABC | 100 | 20/3/23 | - | - | PR01 |
FP | ABC | 150 | 20/3/23 | - | - | PR01 |
FP | ABC | 200 | 20/3/23 | - | - | PR02 |
In the case of lines 1&4, and 2&5, they are the same billing as the NETT_PRICE, PROJECT and DATE_CREATED are the same, where REF dictates which area the billing came from. Where it's duplicated it has to be the AC reference that remains, and the order isn't always AC first.
My attempts at using ROW_NUMBER with OVER have not been successful, as shown below. Any guidance on this would be appreciated, or if ROW_NUMBER is not the way to go.
SELECT [REF]
,[NETT_PRICE]
,[QTY_ENTERED]
,[CREATED]
,[CUS]
,[SALESORDER]
,[PART_ONLY]
,ROW_NUMBER() over (partition by [NETT_PRICE], [QTY_ENTERED], [CREATED], [CUS], [PART_ONLY]
order by [CREATED], [OUR], [REF]
)
FROM [UDEF_SALESAREA_SUMMARY_TBL] WHERE ROW_NUMBER() > 1
From the above table, the expected output should be:
REF | CUSTOMER | NETT_PRICE | DATE_CREATED | PART_NO | SALESORDER | PROJECT |
---|---|---|---|---|---|---|
AC | ABC | 100 | 20/3/23 | 123 | 7001 | PR01 |
AC | ABC | 150 | 20/3/23 | 123 | 7001 | PR01 |
AC | ABC | 170 | 20/3/23 | 123 | 7002 | PR02 |
FP | ABC | 200 | 20/3/23 | - | - | PR02 |
Where the FP duplicates on NETT_PRICE, PROJECT and DATE_CREATED are removed.