How to delete the duplicates from the table using row_number() in psql?

Question

I have written the following query, i am using greenplum db and dbeaver for implementation.

with cte as 
(select 
*, 
row_number() over(partition by first_name order by roll_num) row_num 
from table_name 
where roll_num in ('0011')) 
delete from cte where row_num>1;

The above query is returning error. Can someone help me here, please!

If you are restricting the `roll_num` to only the single value `0011`, then I don't see the point of using `ROW_NUMBER` as you have done. Can you explain your logic? — Tim Biegeleisen, Oct 20 '20 at 03:04
Take roll_num as test case, the actual attribute is a string here. — Rahul Kumar, Oct 20 '20 at 03:08
One question, can we use where inside with and then perform delete query like above? — Rahul Kumar, Oct 20 '20 at 03:09
@TimBiegeleisen Even after removing the delete clause, the query still returns error — Rahul Kumar, Oct 20 '20 at 03:14
Does this answer your question? [Removing Duplicate Rows in PostgreSQL with multiple columns](https://stackoverflow.com/questions/51913783/removing-duplicate-rows-in-postgresql-with-multiple-columns) — Akhilesh Mishra, Oct 20 '20 at 03:17
No @AkhileshMishra, i am still having the same error. Can you please suggest me something else — Rahul Kumar, Oct 20 '20 at 03:23
@TimBiegeleisen , with cte as (select *, row_number() over(partition by first_name order by roll_num) row_num from table_name delete from cte where row_num>1; Can you help with this query, this is clear i guess — Rahul Kumar, Oct 20 '20 at 03:24

score 1 · Accepted Answer · answered Oct 20 '20 at 04:46

1

How about this:

PostgreSQL DELETE statement with USING clause

Ref: PostgreSQL Docs

answered Oct 20 '20 at 04:46

Craig Gers

544
3
9

1

Bro in your example, you have unique field id however i do not have such field in my case – Rahul Kumar Oct 20 '20 at 06:15
If that is the case then simply join the cte_subset on more than one fields, (e.g. first_name and roll_num) in the final where clause? The cte_subset must include these additional fields. – Craig Gers Oct 20 '20 at 09:10

nirav kanani · Answer 2 · 2020-10-20T04:46:36.930

0

Please try like this it works for me:

select *  from tablename as t 
where exists 
      ( select * 
        from tablename as d 
        where d.ctid > t.ctid 
          and d.* is not distinct from t.*
      ) ;

delete from tablename as t 
    where exists 
          ( select * 
            from tablename as d 
            where d.ctid > t.ctid 
              and d.* is not distinct from t.*
          ) ;

edited Oct 20 '20 at 04:46

answered Oct 20 '20 at 04:40

nirav kanani

47
8

i tired that already, the problem is that i have some fields which have null values and either they have min or max ctid in the table. So after removing duplicates, those kind of row remains which have some fields as null but i do want that. That's why i thought to use row_number() as non null fields always gets min row and the row which does not have numm values remain at the end – Rahul Kumar Oct 20 '20 at 05:46

How to delete the duplicates from the table using row_number() in psql?

2 Answers2