Remove duplicate value on one column based on value of other column

Question

I have a table called Table1 see below

PRODUCT    CUSTOMER               COMP
DICE       DAVES PET SHOP         Billed
DICE       CLAXTONS               ToT
CARDS      VIEWSONIC              NITS
CARDS      NORTHERN LIGHTS        Billed
CARDS      NORTHERN LIGHTS        NITS
BOX        TABLEAU                Billed
BOX        TABLEAU                ToT

There are some values where there is a duplicate in the CUSTOMER field but it will always contain Billed in at least one of the COMP sections if it is a duplicate, so I want the resulting query to only return the value in COMP where it is Billed for the duplicates so the resulting table would look like

PRODUCT    CUSTOMER               COMP
DICE       DAVES PET SHOP         Billed
DICE       CLAXTONS               ToT
CARDS      VIEWSONIC              NITS
CARDS      NORTHERN LIGHTS        Billed
BOX        TABLEAU                Billed

Here is the SQL I tried

SELECT * 
FROM Table1
WHERE COMP = 'Billed'
UNION ALL
SELECT Table1_A.PRODUCT, Table1_A.CUSTOMER, Table1_A.COMP
FROM Table1 Table1_A
LEFT JOIN (
    SELECT * 
    FROM Table1 
    WHERE COMP != 'Billed'
) Table1_B ON Table1_B.PRODUCT = Table1_A.PRODUCT 
    AND Table1_B.CUSTOMER = Table1_A.CUSTOMER

I thought if I put in a UNION and referenced both equal to Billed and not equal to Billed, then it would keep the format I'm looking for but it just repeated every single CUSTOMER for every single COMP

Sorry I added the SQL I tried, and it was just a spelling error when I was inputting the values, there is only one "NORTHERN LIGHTS" I have fixed it — Emitsu, Jun 29 '21 at 23:50
Does this answer your question? [How to delete duplicate rows in SQL Server?](https://stackoverflow.com/questions/18390574/how-to-delete-duplicate-rows-in-sql-server) — Dale K, Jun 29 '21 at 23:51
No I actually saw that did try that as well, but that seems to only work where each column of the whole row is the same, for my issue only two columns of each duplicate row are the same, and the third one is different which is the dependent value I am trying to delete on — Emitsu, Jun 29 '21 at 23:56
Just change the query to suit your use case - its exactly the same principle. The `row_number` approach is the way I would suggest. — Dale K, Jun 29 '21 at 23:56
I don't want to delete it from the original table though I just want a query that will show the resulting table I have above — Emitsu, Jun 30 '21 at 00:00
Use the row number technique in your where clause of your select. — Dale K, Jun 30 '21 at 00:01
https://stackoverflow.com/questions/17221543/filter-duplicate-rows-based-on-a-field — Dale K, Jun 30 '21 at 00:01

score 1 · Accepted Answer · answered Jun 30 '21 at 00:13

1

You could use aggregation for your rules:

SELECT PRODUCT, CUSTOMER, 
       (CASE WHEN COUNT(*) = 1 THEN MAX(COMP) ELSE 'Billed' END)
FROM table1 
GROUP BY PRODUCT, CUSTOMER;

You can also use a UNION ALL approach:

select product, customer, comp
from table1 t1
where comp = 'Billed'
union all
select product, customer, comp
from table1 t1
where not exists (select 1
                  from table1 tt1
                  where tt1.product = t1.product and
                        tt1.customer = t1.customer and
                        tt1.comp = 'Billed'
                 );

answered Jun 30 '21 at 00:13

Gordon Linoff

1,242,037
58
646
786

Wow thanks this worked beautifully, I am a bit confused thought about why you used MAX on a VARCHAR field, does that just mean it takes the lowest value from alphabetical order? And in this case since its the same it doesn't matter which one, we are just using it to select one? – Emitsu Jun 30 '21 at 00:18
1

@Emitsu . . . Basically. Technically it is the smallest value based on the *collation* but that is usually alphabetic ordering, with upper case ordered before lower case. – Gordon Linoff Jun 30 '21 at 00:23

Remove duplicate value on one column based on value of other column

1 Answers1