1

I have a table table_1 in Postgres with three columns based on a Django model, ID (auto incrementing value), affiliation, and entry.

What I need to do is take every entry where affiliation = 52, and then add an additional row with entry = query.entry and affiliation = 48. I need additional rows, not changed values. I also need to make sure that the entry + affiliation pair does not exist already.

My latest attempt:

do $$
declare
id_array int[];
begin
insert into id_array table_1.entry where table_1.affiliation = 52;
for id in id_array
loop
if not exists (select entry from table_1 where affiliation = 48)
then 
insert into table_1 (affiliation, entry) values (48, id);
else
raise notice "already exists";
end if;
endloop;
end; $$

I still get a syntax error! Any ideas? I'm at a complete loss. This is by far the most complex query I've ever tried.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Add the syntax error to your question. Though I'm going to say the issue is here: `insert into id_array ...`. If I'm following correctly that probably should be: `select into id_array array_agg(entry) from table1 where affiliation = 52;` – Adrian Klaver Sep 02 '21 at 23:26

1 Answers1

1

You don't need a (clumsy and inefficient, and incorrect) loop for this.

You should have a UNIQUE constraint on (affiliation, entry) to begin with.

With UNIQUE index

With a UNIQUE constraint or index on (affiliation, entry) just use an "UPSERT":

INSERT INTO table_1 (affiliation, entry)
SELECT 48, entry
FROM   table_1
WHERE  affiliation = 52
ON     CONFLICT DO NOTHING;

ON CONFLICT DO NOTHING skips any would-be duplicate. Job done.

This is safe against race conditions under concurrent write load. See:

Without UNIQUE index

SELECT DISTINCT 48, t1.entry
FROM   table_1 t1
LEFT   JOIN table_1 t2 ON t2.entry = t1.entry
                      AND t2.affiliation = 48
WHERE  t1.affiliation = 52
AND    t2.entry IS NULL;  -- does not exist yet

I added DISTINCT as without said UNIQUE index, there might be duplicates on (affiliation, entry) to begin with, and we don't want more than one copy.

Not safe against race conditions under concurrent write load.

See:

Both queries assume entry to be defined NOT NULL. Else you need to define how to deal with NULL values.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228