3

I have a simple PL/PGSQL block Postgres 9.5 that loops over records in a table and conditionally updates some of the records.

Here's a simplified example:

DO $$
  DECLARE

    -- Define a cursor to loop through records
    my_cool_cursor CURSOR FOR
    SELECT
      u.id          AS user_id,
      u.name        AS user_name,
      u.email       AS user_email
    FROM users u
    ;

  BEGIN

    FOR record IN my_cool_cursor LOOP

      -- Simplified example: 
      -- If user's first name is 'Anjali', set email to NULL
      IF record.user_name = 'Anjali' THEN
        BEGIN
          UPDATE users SET email = NULL WHERE id = record.user_id;
        END;
      END IF;

    END LOOP;
  END;

$$ LANGUAGE plpgsql;

I'd like to execute this block directly against my database (from my app, via the console, etc...). I do not want to create a FUNCTION() or stored procedure to do this operation.

The Issue

The issue is that the CURSOR and LOOP create a table-level lock on my users table, since everything between the outer BEGIN...END runs in a transaction. This blocks any other pending queries against it. If users is sufficiently large, this locks it up for several seconds or even minutes.

What I tried

I tried to COMMIT after each UPDATE so that it clears the transaction and the lock periodically. I was surprised to see this error message:

ERROR:  cannot begin/end transactions in PL/pgSQL
HINT:  Use a BEGIN block with an EXCEPTION clause instead.

I'm not quite sure how this is done. Is it asking me to raise an EXCEPTION to force a COMMIT? I tried reading the documentation on Trapping Errors but it only mentions ROLLBACK, so I don't see any way to COMMIT.

  1. How do I periodically COMMIT a transaction inside the LOOP above?
  2. More generally, is my approach even correct? Is there a better way to loop through records without locking up the table?
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
user2490003
  • 10,706
  • 17
  • 79
  • 155
  • The cursor will not block other (read-only) queries. But I don't see the need for a (slow and inefficient) cursor to begin with –  Oct 15 '19 at 15:47

2 Answers2

1

1.

You cannot COMMIT within a PostgreSQL function or DO command at all (plpgsql or any other PL). The error message you reported is to the point (as far as Postgres 9.5 is concerned):

ERROR:  cannot begin/end transactions in PL/pgSQL

A procedure could do that in Postgres 11 or later. See:

There are limited workarounds to achieve "autonomous transactions" in older versions:

But you do not need any of this for the presented case.

2.

Use a simple UPDATE instead:

UPDATE users
SET    email = NULL
WHERE  user_name = 'Anjali'
AND    email IS DISTINCT FROM NULL;  -- optional improvement

Only locks the rows that are actually updated (with corner case exceptions). And since this is much faster than a CURSOR over the whole table, the lock is also very brief.

The added AND email IS DISTINCT FROM NULL avoids empty updates. Related:

It's rare that explicit cursors are useful in plpgsql functions.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

If you want to avoid locking rows for a long time, you could also define a cursor WITH HOLD, for example using the DECLARE SQL statement.

Such cursors can be used across transaction boundaries, so you could COMMIT after a certain number of updates. The price you are paying is that the cursor has to be materialized on the database server.

Since you cannot use transaction statements in functions, you will either have to use a procedure or commit in your application code.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263