Create a table with duplicate values, and use a CTE (Common Table Expression )to delete those duplicate values

Question

Create a table with duplicate values, and use a CTE (Common Table Expression )to delete those duplicate values.

=> Would some one please help me how to start it because i really don't understand the question. Assume guess duplicate values can be chosen anything.

A CTE does not delete anything. I don't understand the question either. Creating a table with duplicated values is easy: `create table tbl as select 1 col1 from dual union all select 1 from dual` creates a table `tbl` with one column `col1` and two rows, both with the value 1 in that column. — , Jun 23 '17 at 22:52
This is like a question fragment. Does it have more that goes with it? — SandPiper, Jun 23 '17 at 23:09
Looks similar to this: https://stackoverflow.com/questions/44730226/deleting-duplicate-row-that-has-earliest-date/44730657#44730657. Not sure the CTE's needed, but I suspect you could replace the subquery with a CTE. — JohnLBevan, Jun 23 '17 at 23:18

JohnLBevan · Answer 1 · 2017-06-23T23:32:23.417

1

For MS SQL Server, this would work:

;with cte as
(
    select *
    , row_number() over (
        partition by [columns], [which], [should], [be], [unique]
        order by [columns], [to], [select], [what's], [kept]
    ) NoOfThisDuplicate
)
delete 
from cte
where NoOfThisDuplicate > 1

SQL Fiddle Demo (based on this question: Deleting duplicate row that has earliest date).

Explanation

Create a CTE
Populate it with all rows from the table we want to delete
Add a NoOfThiDuplicate column to that output
Populate this value with the sequential number of this record with the group/partition of all records with the same values for columns [columns], [which], [should], [be], [unique].
The order of the numbering depends on the sort order of those records when sorted by columns [columns], [to], [select], [what's], [kept]
We delete all records returned by the CTE except the first of each group (i.e. all except those with NoOfThisDuplicate=1).

edited Jun 23 '17 at 23:32

answered Jun 23 '17 at 23:26

JohnLBevan

22,735
13
96
178

Thank you your for clear explanation. Would you please help me how to learn sql for beginners. do you any links to follow up sir @JohnLBevan – randy Jun 24 '17 at 09:23
@Randy a good place to start is https://www.w3schools.com/Sql/. Also see https://www.slant.co/topics/890/~best-online-resources-for-learning-sql – JohnLBevan Jun 24 '17 at 09:45
Thank you for your reply. How about watching videos? I watched it on youtube but different answers. will be vice to have smooth series of video with clear and easy understanding. – randy Jun 24 '17 at 10:00
@Randy; sure; use whatever works best for you. I'm not familiar with any SQL Server tutorial video series myself, but I'm sure there are some out there; ultimately browse, try things out & decide if they work for you / if you feel you're progressing. If you are, stick with it, if not, browse for something else. – JohnLBevan Jun 24 '17 at 10:28

score 0 · Answer 2 · answered Jun 23 '17 at 23:21

Oracle Setup:

CREATE TABLE test_data ( value ) AS
  SELECT LEVEL   FROM DUAL CONNECT BY LEVEL <= 10
UNION ALL
  SELECT 2*LEVEL FROM DUAL CONNECT BY LEVEL <= 5;

Query 1:

This will select the values removing duplicates:

SELECT DISTINCT *
FROM   test_data

But it does not use a CTE.

Query 2:

So, we can put it in a sub-query factoring clause (the name used in the Oracle documentation which corresponds to the SQL Server Common Table Expression)

WITH unique_values ( value ) AS (
  SELECT DISTINCT *
  FROM   test_data
)
SELECT * FROM unique_values;

Query 3:

The sub-query factoring clause was pointless in the previous example ... so doing it a different way:

WITH row_numbers ( value, rn ) AS (
  SELECT value, ROW_NUMBER() OVER ( PARTITION BY value ORDER BY ROWNUM ) AS rn
  FROM   test_data
)
SELECT value
FROM   row_numbers
WHERE  rn = 1;

Will select the rows where it the first instance of each value found.

Delete Query:

But that didn't delete the rows ...

DELETE FROM test_data
WHERE  ROWID IN (
  WITH row_numbers ( rid, rn ) AS (
    SELECT ROWID, ROW_NUMBER() OVER ( PARTITION BY value ORDER BY ROWNUM ) AS rn
    FROM   test_data
  )
  SELECT rid
  FROM   row_numbers
  WHERE  rn > 1
);

Which uses the ROWID pseudocolumn to match rows for deletion.

Create a table with duplicate values, and use a CTE (Common Table Expression )to delete those duplicate values

2 Answers2