
In my Rails 4 app, I have this query to a Postgres 9.4 database:

@chosen_opportunity = Opportunity.find_by_sql([<<-SQL, @deal.id])
  UPDATE opportunities s
  SET    opportunity_available = false
  FROM  (
         SELECT id, prize_id
         FROM   opportunities
         WHERE  deal_id = ?            -- bind parameter instead of string interpolation
         AND    opportunity_available = true
         AND    pg_try_advisory_xact_lock(id)
         LIMIT  1
         FOR    UPDATE
        ) sub
  WHERE  s.id = sub.id
  RETURNING sub.prize_id, sub.id
SQL

Very much inspired by this related answer on dba.SE.

I just want the query to find the first available row (arbitrary pick, via LIMIT) where opportunity_available = true and update it to opportunity_available = false. The row needs to be locked while this happens, but without making new requests wait for the previous lock to be released, since many concurrent calls will run this query.

But I also saw the NOWAIT option for FOR UPDATE. I'm not sure I understand the difference between pg_try_advisory_xact_lock() and NOWAIT; they seem to achieve the same goal:

  • NOWAIT's goal:

    To prevent the operation from waiting for other transactions to commit, use the NOWAIT option.

  • pg_try_advisory_xact_lock's goal:

    To not wait for a previous transaction to release its lock, and to still be able to run another transaction whose SELECT ... FOR UPDATE only operates on rows that are not yet locked.

Which one is better suited to my need?

Mathieu

1 Answer


FOR UPDATE NOWAIT is only a good idea if you insist on locking one particular row, which is not what you need. You just want any qualifying, available (unlocked) row. The important difference (quoting the manual):

With NOWAIT, the statement reports an error, rather than waiting, if a selected row cannot be locked immediately.

Identical queries will very likely try to lock the same arbitrary pick. FOR UPDATE NOWAIT just bails out with an exception (which rolls back the whole transaction unless you trap the error), and you have to retry.
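For illustration, a minimal two-session sketch of that failure mode (table and row id are just placeholders for your setup):

```sql
-- Session 1: lock a row and keep the transaction open
BEGIN;
SELECT * FROM opportunities WHERE id = 1 FOR UPDATE;

-- Session 2: instead of blocking until session 1 commits,
-- NOWAIT raises an error immediately
SELECT * FROM opportunities WHERE id = 1 FOR UPDATE NOWAIT;
-- ERROR:  could not obtain lock on row in relation "opportunities"
-- (SQLSTATE 55P03, lock_not_available) - aborts the transaction unless trapped
```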

The solution in my referenced answer on dba.SE combines plain FOR UPDATE with pg_try_advisory_lock():

pg_try_advisory_lock is similar to pg_advisory_lock, except the function will not wait for the lock to become available. It will either obtain the lock immediately and return true, or return false if the lock cannot be acquired immediately.

So your best option is ... a third alternative: FOR UPDATE SKIP LOCKED (added with Postgres 9.5), which implements the same behavior without the additional function call.

The manual explains the difference:

To prevent the operation from waiting for other transactions to commit, use either the NOWAIT or SKIP LOCKED option. With NOWAIT, the statement reports an error, rather than waiting, if a selected row cannot be locked immediately. With SKIP LOCKED, any selected rows that cannot be immediately locked are skipped.
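Applied to your query, a sketch of the 9.5 variant (same table and columns as in the question; $1 stands for the deal id you pass as a bind parameter):

```sql
UPDATE opportunities s
SET    opportunity_available = false
FROM  (
       SELECT id, prize_id
       FROM   opportunities
       WHERE  deal_id = $1
       AND    opportunity_available = true
       LIMIT  1
       FOR    UPDATE SKIP LOCKED  -- silently skip rows locked by other transactions
      ) sub
WHERE  s.id = sub.id
RETURNING sub.prize_id, sub.id;
```

Concurrent callers then each lock and update a different qualifying row instead of erroring out or queuing behind one another.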

With Postgres 9.4 or older, your next best option is to use pg_try_advisory_xact_lock(id) in combination with FOR UPDATE, as demonstrated in the referenced answer:

(That answer also includes an implementation with FOR UPDATE SKIP LOCKED.)

Aside

Strictly speaking you get arbitrary, not truly random picks. That can be an important distinction.
An audited version of your query is in my answer to your other question.

Erwin Brandstetter
  • thanks so much for your detailed answer. I'll try to implement it next Monday. Yes, you're right, UPDATE ... SKIP LOCKED is best suited. I might wait for its release and in the meantime use the advisory lock. As my rows are already distributed randomly at that stage, it does not matter that I get an arbitrary pick. – Mathieu Oct 16 '15 at 07:02
  • @Mathieu: "Arbitrary" also matters because it implies that Postgres will typically pick the *same* row for the same query *arbitrarily*, which makes lock contention a much bigger issue than truly random picks would. – Erwin Brandstetter Oct 16 '15 at 07:07
  • what do you mean 'it will always pick the same row'? my app is a sort of lottery, if you want. concurrent calls by players are made, and when a player checks a 'ticket = opportunity' I update it to available = false, so when the next shot of a player comes, it will use the query we're talking about and choose any row where opportunity_available = true. so if the query always picks the same row, since it will be updated to not available as soon as it is opened by a player, I don't think this arbitrary aspect will impact me. do you? postgresql rookie here – Mathieu Oct 16 '15 at 07:16
  • @Mathieu: If a query does not specify `ORDER BY`, Postgres is free to choose the most efficient way to return qualifying rows in any order. Typically that's a sequential scan. So the order is not *random* but *arbitrary*: the same query will probably (but not necessarily) return the same rows in the same order, even without `ORDER BY`. Unfortunate for a queuing system like yours - but typically still faster than imposing random order. The current physical position of rows in a table can be found in the [system column `ctid`](http://dba.stackexchange.com/a/66007/3684) and can change any time. – Erwin Brandstetter Oct 16 '15 at 19:09
  • As I already have the 10 "prizes" randomly distributed among, say, 1000 opportunities, I don't think it actually matters for me that PostgreSQL arbitrarily chooses the row. As long as it "arbitrarily" chooses a row complying with the select query (a row where available = true), the app will work. In any case, thanks for the link to ctid, very interesting for future endeavours. – Mathieu Oct 17 '15 at 06:24
  • I am not saying it would impair randomness of your sample. If the order in the table is random, arbitrary picks are good enough. It just increases lock contention. – Erwin Brandstetter Oct 17 '15 at 07:01
  • ha ok got it. for a 'lottery-like' system (each player calls the db to see if what he clicked selects an available opportunity = true, then is told whether it's a win or a loss), do you think another db type would be a better fit to avoid lock contention and allow a very high number of concurrent calls? if postgresql is the right fit, where would you look for optimization, like advanced tactics? – Mathieu Oct 17 '15 at 07:35
  • Test whether lock contention is an actual problem in your use case. It probably isn't. If it is then, depending on the complete picture, a cheap random selection might be an option: http://stackoverflow.com/a/8675160/939860. Or start a new question with all the necessary details. – Erwin Brandstetter Oct 17 '15 at 21:16
  • I'm at a loss, struggling with the query so I opened a new question: http://stackoverflow.com/questions/33330915/postgresql-9-4-prevent-app-selecting-always-the-latest-updated-rows – Mathieu Oct 25 '15 at 14:42
  • moved to another question: I think the problem comes from the fact (and you spotted it in your intuition here quite fast, hehe) that I need to move away from an ARBITRARY pick to a truly RANDOM pick while still having the fastest select/update query possible, with locks: http://stackoverflow.com/questions/33333021/postgresql-9-4-fastest-query-to-select-and-update-on-large-dataset-30m-rows – Mathieu Oct 25 '15 at 18:08