0

I would like to number all rows of my table according to how many times a given row has occured in the table. I tried to define a table of my columns as follows:

select COLUMN_NAME as cl from information_schema.columns where table_name = 'TEST'

Then I would like to run the following query.

SELECT *,
ROW_NUMBER() OVER(PARTITION BY Cols IN cl ORDER BY (SELECT 0)) AS rn
FROM TEST 

The second query works nicely if I manually specify the column names for partition. But otherwise it fails. Is there a way of doing this? The motivation behind all this is that I have a table without an ID, where there can be row duplicates which I want to get rid of. The second query is inspired by answers to this question. I use MS SQL 2008.

Sample data before:

| Column1 | Column2 |  Column3   |
|---------|---------|------------|
| aaaa    |    1111 | 23.04.2018 |
| bbbb    |    2222 | 24.04.2018 |
| cccc    |    3333 | 25.04.2018 |
| dddd    |    4444 | 26.04.2018 |
| bbbb    |    4445 | 27.04.2018 |
| aaaa    |    1111 | 23.04.2018 |
| aaaa    |    1234 | 23.04.2018 |

Sample data after:

| Column1 | Column2 |  Column3   |
|---------|---------|------------|
| aaaa    |    1111 | 23.04.2018 |
| bbbb    |    2222 | 24.04.2018 |
| cccc    |    3333 | 25.04.2018 |
| dddd    |    4444 | 26.04.2018 |
| bbbb    |    4445 | 27.04.2018 |
| aaaa    |    1234 | 23.04.2018 |
JAV
  • 279
  • 2
  • 9
  • 1
    Sample data and desired results would really help. – Gordon Linoff Apr 24 '18 at 11:23
  • Is your goal to identify rows that are completely identical (same value in all columns)? – Dan Guzman Apr 24 '18 at 11:24
  • Yes my goal is to identify rows that are completely identical. I have to import and update data from Excel and there is no ID in it. To be able to to keep the table in the SQL database without duplicates, I would like to run something that cleans the table from perfect duplicates. – JAV Apr 24 '18 at 11:27
  • 2
    Uhm, to not select the duplicates, then you could just do a [SELECT DISTINCT](https://www.w3schools.com/sql/sql_distinct.asp)? – LukStorms Apr 24 '18 at 11:38
  • Sweet it seems to work. So it should be enough to write SELECT DISTINCT * TEST? – JAV Apr 24 '18 at 11:41
  • Yes, the DISTINCT will get unique records over ALL the fields you select. You could also do this via row_number, but you'd have to put it in a sub-query and then select those where RN = 1. – LukStorms Apr 24 '18 at 11:47

2 Answers2

2

You can define a second table and insert a simple SELECT DISTINCT *. Try it out:

Hint: I use declared table variables here, but you will need real tables for this.

SET DATEFORMAT DMY;

DECLARE @tbl TABLE(Column1 VARCHAR(100),Column2 INT,Column3 DATE)
INSERT INTO @tbl VALUES
 ('aaaa',1111,'23.04.2018')
,('bbbb',2222,'24.04.2018')
,('cccc',3333,'25.04.2018')
,('dddd',4444,'26.04.2018')
,('bbbb',4445,'27.04.2018')
,('aaaa',1111,'23.04.2018')
,('aaaa',1234,'23.04.2018');

--define a new table with the same structure
--Make sure, that the column order is absolutely the same as the origin!
DECLARE @tbl2 TABLE(Column1 VARCHAR(100),Column2 INT,Column3 DATE)

INSERT INTO @tbl2
SELECT DISTINCT * FROM @tbl;

--Check the result
SELECT * FROM @tbl2;

Alternatively you might use this syntax to create a physical table on the fly

SELECT DISTINCT * INTO dbo.NewTable FROM @tbl;

Attention: Do not use culture dependant date formats...

Shnugo
  • 66,100
  • 9
  • 53
  • 114
0

In your case the DISTINCT is enough.

SELECT DISTINCT Column1, Column2, Column3
FROM TEST

If you have any different values in Column3, then you write something below:

SELECT Column1, Column2, Column3
FROM(
    SELECT *,
         ROW_NUMBER() OVER(PARTITION BY Col1,Col2 ORDER BY Column3) AS RN
    FROM TEST
    )D
WHERE RN=1
DineshDB
  • 5,998
  • 7
  • 33
  • 49