16

I am using postgreSQL 9.1 and I want to delete duplicates from my table using this tip: https://stackoverflow.com/a/3822833/2239537

So, my query looks like that:

WITH cte
 AS (SELECT ROW_NUMBER() 
 OVER (PARTITION BY code, card_id, parent_id 
     ORDER BY id DESC) RN
     FROM card)
DELETE FROM cte
WHERE RN > 1

But it shows me

ERROR: relation "cte" does not exist
SQL state: 42P01
Character: 157

However this statement works fine:

WITH cte
 AS (SELECT ROW_NUMBER() 
 OVER (PARTITION BY code, card_id, parent_id 
     ORDER BY id DESC) RN
     FROM merchantcard)
SELECT * FROM cte
WHERE RN > 1

Any ideas how to get it work? Thanks!

Community
  • 1
  • 1
Alex Kartishev
  • 1,836
  • 3
  • 18
  • 25

3 Answers3

40

that's because CTE in PostgreSQL works differently than CTE in SQL Server. In SQL Server CTE are like an updatable views, so you can delete from them or update them, in PostgreSQL you cannot.

you can join cte and delete, like:

with cte as (
    select
        id,
        row_number() over(partition by code, card_id, parent_id order by id desc) as rn
    from card
)
delete
from card
where id in (select id from cte where rn > 1)

On the other hand, you can write DDL statements inside CTE in PostgreSQL (see documentation) and this could be very handy. For example, you can delete all rows from card and then insert only those having row_number = 1:

with cte1 as (
    delete
    from card
    returning *
), cte2 as (
    select
        row_number() over(partition by code, card_id, parent_id order by id desc) as rn,
        *
    from cte1
)
insert into card
select <columns here>
from cte2
where rn = 1
Alex Kartishev
  • 1,836
  • 3
  • 18
  • 25
Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
  • 1
    Wow, thanks for the quick reply and useful answer! Worked for me, only had to add "delete FROM card", but this is just typo, of course. – Alex Kartishev Aug 26 '13 at 07:58
  • 1
    The first solution appears to delete all rows in the table for me. I don't understand cte behavior: if I do `select count(1) from cte where rn > 1` I get the correct number, but `select count(1) from card where id in (select id from cte where rn > 1)` returns all rows – Arkadiy Kukarkin Nov 11 '15 at 21:42
  • 1
    @ArkadiyKukarkin could it be that `id` is not unique? to help you it's better to have more information about your structure and result you want to get – Roman Pekar Nov 12 '15 at 08:26
  • 1
    upvoted! does this work with multiple columns as well i have a case, delete from table where (a,b) in (select a,b from cte) a,b form a composite key in my table – PirateApp May 28 '18 at 16:41
  • 1
    This works to dedupe a table in Postgres. Wow, I made my way here from trying to delete full duplicate rows in Postgres and was about to give up, then found this gold that combined other stuff I found in a way that works. Thanks! – liquid_diamond Feb 14 '22 at 23:23
10

I know, you are asking how you can solve your problem using the WITH statement, and got a good answer already. But I suggest looking at alternatives in the same question you linked.

What about this one?

DELETE FROM card
WHERE id NOT IN (
  SELECT MIN(id) FROM card
  GROUP BY code, card_id, parent_id 
);
maf-soft
  • 2,335
  • 3
  • 26
  • 49
  • 2
    This is the clearest way I've seen to detect and clean duplicate records in PostgreSQL. Thank you. – Carl Zulauf Oct 17 '14 at 23:08
  • 1
    So why do I get so many downvotes? It's not fair not to tell me. – maf-soft Jul 26 '17 at 11:38
  • 2
    probably because your solution requires that there already exists a unique key (id) which can be used in the where... the original question doesn't have one of those – Jamie Cook Nov 02 '17 at 06:32
2

For me it worked Like this in Postgres/GreenPlum :

delete
from card where id in (
with cte as (
    select
        id,
        row_number() over(partition by code, card_id, parent_id order by id desc) as rn
    from card
)
select id from cte where rn > 1);