Removing "duplicate" records from SQL Query

Question

I'm trying to write a query that will remove a "duplicate" value if it's before a change in the value. For example, I have the following data:

id, date_recived, name1, percent
1, 8/19/13, Joe, 2
2, 8/15/13, Joe, 3
3, 8/12/13, Joe, 2
4, 4/08/13, Joe, 2

In this example I need to remove only the row with id 3 because there was no change in the percent. I've tried using Row_Number, however when I partition by the name and percent the data appears as follows:

id, date_recived, name1, percent, Row_Number
1, 8/19/13, Joe, 2, 3
2, 8/15/13, Joe, 3, 1
3, 8/12/13, Joe, 2, 2
4, 4/08/13, Joe, 2, 1

This won't work because 1 would be thrown out and I need to keep it. Any help would be appreciated.

Thanks,

EDIT:

The query is (from the comment):

WITH feedback_On_Same_Date AS (
      SELECT id, received_dt, participant_nm, pretax_deferral_prcnt,
             ROW_NUMBER() OVER(PARTITION BY participant_nm, received_dt ORDER BY received_dt DESC, id DESC) AS RowNumberByDt
      FROM be_ing_feedback
     )
SELECT *, ROW_NUMBER() OVER(PARTITION BY participant_nm, pretax_deferral_prcnt ORDER BY received_dt ASC, id ASC) AS RowNumberByDef
FROM feedback_On_Same_Date
WHERE RowNumberByDt = 1
ORDER BY participant_nm ASC, received_dt DESC

`WITH feedback_On_Same_Date AS ( SELECT id, received_dt, participant_nm, pretax_deferral_prcnt, ROW_NUMBER() OVER(PARTITION BY participant_nm, received_dt ORDER BY received_dt DESC, id DESC) AS RowNumberByDt FROM be_ing_feedback ) SELECT *, ROW_NUMBER() OVER(PARTITION BY participant_nm, pretax_deferral_prcnt ORDER BY received_dt ASC, id ASC) AS RowNumberByDef FROM feedback_On_Same_Date WHERE RowNumberByDt = 1 ORDER BY participant_nm ASC, received_dt DESC` — Spoonman, Feb 06 '14 at 14:34

score 1 · Accepted Answer · answered Feb 06 '14 at 14:28

It sounds like you want to delete rows where the "previous" one is identical.

To do this I would use the Row_Number() function and a self join:

; WITH cte AS (
  SELECT id
       , date_recived
       , name1
       , percent
       , Row_Number() OVER (PARTITION BY name ORDER BY date_recived) As seq
  FROM   your_table
)
SELECT *
FROM   cte As curr
 LEFT
  JOIN cte As prev
    ON prev.name1   = curr.name1
   AND prev.seq + 1 = curr.seq

Have a look at the data returned by this query and then it should just be a simple case of adding a WHERE clause to get the records you want to delete.

I'll leave the rest of the puzzle up to you ;-)

score 0 · Answer 2 · answered Feb 06 '14 at 14:42

0

If you are using SQL Server 2012, just use the lag() function:

select id, date_recived, name1, percent
from (select t.*,
             lag(percent) over (partition by name1 order by date_received) as prev_percent
      from t
     ) t
where percent <> prev_percent or prev_percent is null;

answered Feb 06 '14 at 14:42

Gordon Linoff

1,242,037
58
646
786

That would be awesome, and would appear to work. Too bad I'm only using 2008. Thank you kindly though. – Spoonman Feb 06 '14 at 14:51

Removing "duplicate" records from SQL Query

2 Answers2