How to exclude data in sql when a particular combination of columns determine its duplicativeness?

Question

I have a dataset which as far as I know is correct. It looks as below.

    QuoteStatus Quoteid batchID EffDate Iteration   Months  Revenue
1   Block Ready 275576  900265  3/1/2019    1      1096     635791
2   Block Ready 275654  900265  3/1/2019    1      1096     635791
3   Sold        275654  900265  3/1/2019    2      1096     635791

However, I have a requirement that when a certain combination of columns put together, each record has to be unique.

From the resultset – I see this as a duplicate record because the combination of QuoteStatus = ‘Block Ready’ and Iteration = ‘1’ repeats more than once. (Even though QuoteID is distinct) I do not want row 1 to appear and I cannot simply remove “QuoteID” from Select statement because I want it to display .

I have tried ROW_NUMBER ( ) OVER ( PARTITION BY QuoteId Order by QuoteID ) as ROW_Partition. However the result is not what i am looking for. I cannot exlclude ROW_Partition = '1' because that would still leave row 2.

enter image description here

    QuoteStatus Quoteid batchID EffDate Iteration   ROW_Partition   Months  Revenue
1   Block Ready 275576  900265  3/1/2019    1           1           1096    635791
2   Block Ready 275576  900265  3/1/2019    1           2           1096    635791
3   Block Ready 275576  900265  3/1/2019    1           3           1096    635791
4   Block Ready 275576  900265  3/1/2019    1           4           1096    635791
5   Block Ready 275654  900265  3/1/2019    1           1           1096    635791
6   Block Ready 275654  900265  3/1/2019    1           2           1096    635791
7   Block Ready 275654  900265  3/1/2019    1           3           1096    635791
8   Block Ready 275654  900265  3/1/2019    1           4           1096    635791
9   Sold        275654  900265  3/1/2019    2           5           1096    635791
10  Sold        275654  900265  3/1/2019    2           6           1096    635791
11  Sold        275654  900265  3/1/2019    2           7           1096    635791
12  Sold        275654  900265  3/1/2019    2           8           1096    635791

I want to remove row 1 without explicitly excluding the QuoteID = ‘275576’ because this could happen to many other Quotes.

Please include the SQL and the data as formatted text in your Question, as that will help us be able to figure out what's going on. — computercarguy, Aug 02 '19 at 20:21
Post sample data and the expected results as text (not images) and explain better the conditions you wan to apply. — forpas, Aug 02 '19 at 20:21

Gordon Linoff · Accepted Answer · 2019-08-02T23:33:49.187

2

I think you want to calculate the row number based on the columns you want to be distinct, keeping in mind that to filter on ROW_NUMBER you'll need to wrap it in a subquery or CTE (SQL Row_Number() function in Where Clause):

ROW_NUMBER() OVER (PARTITION BY QuoteStatus, Iteration ORDER BY QuoteID) as seqnum

You can then filter on this value.

edited Aug 02 '19 at 23:33

answered Aug 02 '19 at 20:44

Gordon Linoff

1,242,037
58
646
786

Hi @Gordon Linoff silly question here...can i use the 'seqnum' to filter out results in the same query? Thanks. – i.am.satv Aug 02 '19 at 21:05
@i.am.satishv . . . You need a subquery or CTE. – Gordon Linoff Aug 02 '19 at 21:08
Thanks for helping me out, I will close this post now. Looks like using ROW_NUMBER() is the way to go however I could not get to exclude the 4 records which have Quote Status = 'Block Ready' and QuoteID = '275576' without using Quoteid as filter. – i.am.satv Aug 02 '19 at 21:33

How to exclude data in sql when a particular combination of columns determine its duplicativeness?

1 Answers1