
I have the following query, which I use with postgres 9.5:

INSERT INTO knowledge_state 
(SELECT learnerid learner_id, lo_id FROM qb_lo_tag WHERE qb_id = NEW.qb_id)
ON CONFLICT DO NOTHING ;

Unfortunately I can't use Postgres 9.5 on some servers, so I need to convert it to a pre-9.5-friendly query. I have built the following query instead, but it seems much more complicated to me, and I thought something simpler might be possible.

FOR rows IN SELECT lo_id FROM qb_lo_tag WHERE qb_id = NEW.qb_id LOOP

  INSERT INTO knowledge_state (lo_id, learner_id)
  SELECT rows.lo_id, learnerid
  WHERE NOT EXISTS (SELECT 1 FROM knowledge_state
                    WHERE lo_id = rows.lo_id AND learner_id = learnerid);

END LOOP;

I would love to hear ideas on how to simplify this query.

Clémentine

1 Answer


Just do what you're doing, without the loop:

INSERT INTO knowledge_state (lo_id, learner_id)
SELECT a.lo_id, a.learnerid
FROM qb_lo_tag a
WHERE a.qb_id = NEW.qb_id
  AND NOT EXISTS (SELECT * FROM knowledge_state b
                  WHERE b.lo_id = a.lo_id AND b.learner_id = a.learnerid);

Of course, you can add a unique index on knowledge_state (lo_id, learner_id) to make it faster (ON CONFLICT only ever fires against a unique index or exclusion constraint anyway, and a unique constraint is implemented with a unique index).
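For example, a minimal sketch of such a constraint, assuming knowledge_state does not already have one (the constraint name here is made up):

-- Hypothetical DDL: enforce one row per (lo_id, learner_id) pair.
-- The unique constraint is backed by an index, which also speeds up the
-- NOT EXISTS check above and is what ON CONFLICT would target on 9.5+.
ALTER TABLE knowledge_state
    ADD CONSTRAINT knowledge_state_lo_learner_uniq UNIQUE (lo_id, learner_id);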

tpdi
  • Wrong. This is incorrect in the presence of concurrent transactions; it may still raise a duplicate key error if two are run at the same time. You must `LOCK TABLE` first or use a `LOOP` in plpgsql to trap the duplicate key exception and retry. It's the same problem as an upsert really. – Craig Ringer Jan 18 '16 at 04:52
  • @CraigRinger is my version of the query exempt from the problem you describe above? – Clémentine Jan 18 '16 at 10:01
  • 3
    @Clémentine No. Imagine if two concurrent sessions both did the `INSERT ... WHERE NOT EXISTS (SELECT ...)` *at the same time*. The `WHERE NOT EXISTS` would be *true* for *both* of them since neither would find a row. So both would proceed to `INSERT`. One would fail. This is a consequence of transaction visibility rules, specifically prevention of dirty reads, combined with the fact that unique indexes partially violate transaction isolation in order to ensure uniqueness. You *must* either **lock the table** or **use a plpgsql loop that catches unique_violation and retries**. – Craig Ringer Jan 18 '16 at 11:22