1

This code below works with SELECT statement:

WITH 
  smaller_uuid AS (
    SELECT
      id, uuid, email, first_name, last_name, display_name
    FROM stack_users_production.users AS user1
    WHERE EXISTS (
      SELECT
        id, uuid, email
      FROM
        stack_users_production.users AS user2
      WHERE
        user1.id = user2.id AND
        user1.email = user2.email AND
        user1.uuid < user2.uuid
    )
  )

SELECT 
  id, uuid, email, first_name, last_name, display_name
FROM 
  stack_users_production.users
WHERE
  uuid IN (SELECT uuid FROM smaller_uuid);

But this code below does not work with DELETE statement:

WITH 
  smaller_uuid AS (
    SELECT
      id, uuid, email, first_name, last_name, display_name
    FROM stack_users_production.users AS user1
     WHERE EXISTS (
      SELECT
        id, uuid, email
      FROM
        stack_users_production.users AS user2
      WHERE
        user1.id = user2.id AND
        user1.email = user2.email AND
        user1.uuid < user2.uuid
    )
  )

DELETE FROM 
  stack_users_production.users
WHERE
  uuid IN (SELECT uuid FROM smaller_uuid);

It says syntax error:

psql:snippets.pgsql:113: ERROR:  syntax error at or near "DELETE"
LINE 18: DELETE FROM

I am using PostgreSQL 9.6.3, and is connected to AWS Redshift when executing this query.

Kara
  • 6,115
  • 16
  • 50
  • 57
Christian Sakai
  • 929
  • 3
  • 9
  • 25

1 Answers1

5

Redshift is not PostgreSQL, it's a fork based on (ancient!) Postgres 8.0 and developed separately since. This is a misunderstanding:

I am using PostgreSQL 9.6.3, and is connected to AWS Redshift when executing this query.

You are using psql (the Postgres command line interface), that much is clear from the error message. But you are accessing a Redshift database, which means you are not, in fact, using PostgreSQL 9.6.3 at all.

I edited your question to clarify.

The list of Unsupported PostgreSQL Features is long - and incomplete. Among other things, data-modifying CTEs are not supported - or any CTEs in write operations. Documented on this page: Features That Are Implemented Differently:

INSERT, UPDATE, and DELETE
WITH is not supported.

That's why the SELECT works, but the DELETE does not.

While being at it, Amazon also warns:

Only the 8.x version of the PostgreSQL query tool psql is supported.

Using psql shipped with Postgres 9.6 may have added to the confusion.

Your query would work just fine in PostgreSQL - even though you might radically simplify:

DELETE FROM stack_users_production.users u
WHERE  EXISTS (
   SELECT 1
   FROM   stack_users_production.users
   WHERE  id    = u.id
   AND    email = u.email
   AND    uuid  > u.uuid
   );

And that might just work on Redshift, too.

However, be aware that this DELETE will not necessarily make (id, email) unique. There may be multiple rows with the same uuid - unless you know that cannot happen. Else you need a subquery with DISTINCT ON or row_number() (implemented in Redshift, too) to guarantee a single row with the "greatest" uuid per (id, email) as survivor.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228