Removing duplicates rows based off one column SQL

Question

I need to clear some partial duplicate rows. In this case, the columns InvoiceTaskID have a few duplicate values and I want to clear them.

I have tried partitioning based of some of the solutions here but I keep getting row number = 1 for every single row when I execute.

I don't have a primary key to work with

Note: the column dbo.D.duplicate is a count of how many times the duplicates in appear in the column InvoiceTaskID

Does anyone have a solutions?

 select    dbo.A.TaskId,
           dbo.A.InvoiceTaskId,
           dbo.A.Client,
           dbo.Acd.InvoiceId,
           dbo.A.Date,
           dbo.A.JobId,
           dbo.A.TaskName,
           dbo.A.StaffName,
           dbo.A.BillableRate,
           dbo.A.Hrs,
           dbo.A.Amount,
           dbo.D.duplicate,
    
           ROW_NUMBER() over (partition by dbo.A.TaskId,
           dbo.A.InvoiceTaskId,
           dbo.A.Client,
           dbo.A.InvoiceId,
           dbo.A.Date,
           dbo.A.JobId,
           dbo.A.TaskName,
           dbo.A.StaffName,
           dbo.A.BillableRate,
           dbo.A.Hrs,
           dbo.A.Amount,
           dbo.D.duplicate
    order by dbo.A.TaskId,
           dbo.A.InvoiceTaskId,
           dbo.A.Client,
           dbo.A.InvoiceId,
           dbo.A.Date,
           dbo.A.JobId,
           dbo.A.TaskName,
           dbo.A.StaffName,
           dbo.A.BillableRate,
           dbo.A.Hrs,
           dbo.A.Amount,
           dbo.D.duplicate) RowNumber
    from dbo.A
    INNER JOIN dbo.D ON dbo.D.TaskNo = dbo.A.InvoiceTaskId

That query with the formatting looks *very* odd. It looks like the `ORDER BY` is *before* your `FROM`. — Thom A, Aug 25 '23 at 14:11
FYI [3+ part naming on Columns will be Deprecated](https://wp.larnu.uk/3-part-naming-on-columns-will-be-deprecated/) and should be avoided. You are far better off aliasing your objects in the `FROM` and using those to qualify your columns. This avoids the use of a feature that will be deprecated and possibly removed in a future version, and makes your code much more succinct and readable. — Thom A, Aug 25 '23 at 14:12
Does this answer your question? [How to delete duplicate rows in SQL Server?](https://stackoverflow.com/questions/18390574/how-to-delete-duplicate-rows-in-sql-server) — Thom A, Aug 25 '23 at 14:13
If you are trying to identify duplicates using only a single column, then only that column should appear in your `PARTITION BY` for your `ROW_NUMBER()` windowing function. Remove the rest of that stuff. Essentially you want a single multi-row partition (grouping of records) for every distinct InvoiceTaskID, and you want the rest of these columns to order the rows in that partition to number with the `ROW_NUMBER()` logic. — JNevill, Aug 25 '23 at 14:14
Does this your question? [Get top 1 row of each group](https://stackoverflow.com/questions/6841605/get-top-1-row-of-each-group) — Thom A, Aug 25 '23 at 14:14
`ROW_NUMBER() over (partition by dbo.A.InvoiceTaskId order by ....` — Charlieface, Aug 25 '23 at 14:24

score -1 · Answer 1 · answered Aug 25 '23 at 18:06

You can delete the records you obtained using WITH your query.

WITH RECORDS AS (
select    dbo.A.TaskId,
           dbo.A.InvoiceTaskId,
           dbo.A.Client,
           dbo.Acd.InvoiceId,
           dbo.A.Date,
           dbo.A.JobId,
           dbo.A.TaskName,
           dbo.A.StaffName,
           dbo.A.BillableRate,
           dbo.A.Hrs,
           dbo.A.Amount,
           dbo.D.duplicate,
    
           ROW_NUMBER() over (partition by dbo.A.TaskId,
           dbo.A.InvoiceTaskId,
           dbo.A.Client,
           dbo.A.InvoiceId,
           dbo.A.Date,
           dbo.A.JobId,
           dbo.A.TaskName,
           dbo.A.StaffName,
           dbo.A.BillableRate,
           dbo.A.Hrs,
           dbo.A.Amount,
           dbo.D.duplicate
    order by dbo.A.TaskId,
           dbo.A.InvoiceTaskId,
           dbo.A.Client,
           dbo.A.InvoiceId,
           dbo.A.Date,
           dbo.A.JobId,
           dbo.A.TaskName,
           dbo.A.StaffName,
           dbo.A.BillableRate,
           dbo.A.Hrs,
           dbo.A.Amount,
           dbo.D.duplicate) RowNumber
    from dbo.A
    INNER JOIN dbo.D ON dbo.D.TaskNo = dbo.A.InvoiceTaskId
) DELETE FROM RECORDS WHERE RowNumber > 1

Removing duplicates rows based off one column SQL

1 Answers1