0

I have a table with two possible unique identifiers (ID1 and ID2). Each row will have one or both of these identifiers. The data in each row is exactly the same for each ID, with the exception of a timestamp. I would like to eliminate duplicates from each value, but treat nulls as unique values.

This question: How to delete duplicate rows in sql server?

Referred me to this site: http://www.codaffection.com/sql-server-article/delete-duplicate-rows-in-sql-server/

where I came up with the following query:

WITH CTE AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY ID1 ORDER BY ID1) AS RN
FROM Filings_Search
)

DELETE FROM CTE WHERE RN<>1

Unfortunately, this deleted all my null values as well! How can I modify this query to avoid deleting null values?

Edit: Here is a sample of what my data would look like (if anyone knows how to format tables well, let me know. I used https://senseful.github.io/text-table/).

+------+------+----------+-----------+
| ID1  | ID2  |   Data   | Timestamp |
+------+------+----------+-----------+
| NULL | abc  | macd     | 01:40     |
| NULL | abc  | macd     | 04:23     |
| NULL | def  | pfchangs | 01:41     |
| 123  | NULL | wendys   | 02:42     |
| 123  | NULL | wendys   | 03:45     |
+------+------+----------+-----------+

Running on ID1 would output:

+------+------+----------+-----------+
| ID1  | ID2  |   Data   | Timestamp |
+------+------+----------+-----------+
| NULL | abc  | macd     | 01:40     |
| NULL | abc  | macd     | 04:23     |
| NULL | def  | pfchangs | 01:41     |
| 123  | NULL | wendys   | 02:42     |
+------+------+----------+-----------+

Running on ID2 would output:

+------+------+----------+-----------+
| ID1  | ID2  |   Data   | Timestamp |
+------+------+----------+-----------+
| NULL | abc  | macd     | 01:40     |
| NULL | def  | pfchangs | 01:41     |
| 123  | NULL | wendys   | 02:42     |
| 123  | NULL | wendys   | 03:45     |
+------+------+----------+-----------+

Apologies if this is a duplicate, I am a SQL beginner and couldn't find anything exactly like what I was looking for.

Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118
tuxtuxtux
  • 75
  • 10
  • 3
    Show us db schema, sample data, current and expected output. Please read [**How-to-Ask**](http://stackoverflow.com/help/how-to-ask) And here is a great place to [**START**](http://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum/) to learn how improve your question quality and get better answers. – Juan Carlos Oropeza Feb 05 '19 at 17:28
  • What were the NULL values? Right now isnt clear what is the data and what row you want delete and which one you want to keep. So we can't really help you beside guessing. – Juan Carlos Oropeza Feb 05 '19 at 17:31
  • To format as `code`, select the text and press `ctrl-k` or add 4 spaces before each line – Juan Carlos Oropeza Feb 05 '19 at 18:09

2 Answers2

0

what about:

 DELETE FROM CTE 
 WHERE RN<>1
   AND ID1 IS NOT NULL
Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118
0

use ID2,and Data in partition by

   WITH CTE AS (
        SELECT f.*, ROW_NUMBER() OVER (PARTITION BY  ID2,data ORDER BY Timestamp ) AS RN
        FROM Filings_Search 
    )
    DELETE FROM CTE WHERE RN<>1
Zaynul Abadin Tuhin
  • 31,407
  • 5
  • 33
  • 63