6

I have the following UPDATE statement:

update mytable
   set a = first_part(genid()),
       b = second_part(genid()),
       c = third_path(genid())
 where package_id = 10;

In this example the function genid() is called three times for each row, which is wrong - I want it to be called only once for each row of mytable.

I'm using PostgreSQL 8.4 database. How to write the correct update?

I've tried something like this:

update mytable
   set a = first_part(g),
       b = second_part(g),
       c = third_path(g)
 where package_id = 10
  from genid() as g;

But it didn't work, because genid() has been called only once for the whole update statement.

Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
Cezariusz
  • 463
  • 8
  • 15
  • Can't you put that in a stored procedure and first call `genid()`, store it in a variable and then use it in your update statement? – juergen d Dec 30 '11 at 08:17
  • you simplified too much the example, I ques you have gen_id(column) – Florin Ghita Dec 30 '11 at 08:42
  • @Florin No, genid takes no parameters, it generates a unique value on each call, based on an arbitrary sequence and random generator. That's why I want it to be called once for the each row of the mytable. – Cezariusz Dec 30 '11 at 09:59
  • Looks like there is already a working solution in the answers, but is defining an additional column for the full `genid` + a row-level trigger that updates a, b and c on insert/update of that column an option? – tscho Dec 30 '11 at 11:45
  • Additional column is not an option, we have about 2000 such tables, and the schema is rather constant. This column would be too much overhead for millions of records. – Cezariusz Dec 30 '11 at 12:55
  • @Cezariusz can you try with a cursor? as I suggested in my updated answer – Florin Ghita Dec 30 '11 at 14:11

3 Answers3

10

Have you tried Postgres' non-standard UPDATE .. FROM clause? I imagine, this would work

update mytable
   set a = first_part(gen.id),
       b = second_part(gen.id),
       c = third_path(gen.id)
  from (
          select genid() as genid, id
          from mytable 
          where package_id = 10
       ) gen
 where mytable.id = gen.id;
 --and package_id = 10 -- This predicate is no longer necessary as the subquery
                       -- already filters on package_id, as Erwin mentioned

Note that I'm forcing genid() to be called exactly once per record in mytable within the subselect. Then I'm self-joining mytable and gen using a hypothetical id column. See the documentation here:

http://www.postgresql.org/docs/current/interactive/sql-update.html

This seems to have been introduced with Postgres 9.0 only, though. If that seems too complicated (i.e. not very readable), you can still resort to pgplsql as user Florin suggested here.

Community
  • 1
  • 1
Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
  • This works great, thanks. I have some performance concerns, though, because of the hash joins this involves. – Cezariusz Dec 30 '11 at 15:58
  • The second `where package_id = 10` is redundant. The subquery already filters (as it should). If `mytable.id` is indexed (as it should be), the redundant check is of no use. – Erwin Brandstetter Dec 30 '11 at 16:20
  • @Cezariusz: Have you analysed the execution plan? If you self-join on primary keys, the join should be fairly efficient. – Lukas Eder Dec 30 '11 at 16:44
  • @ErwinBrandstetter: You're right. I hadn't thought about it that way. I'll fix it – Lukas Eder Dec 30 '11 at 16:45
  • +1 Now I like your answer. :) You should probably link to the manual for version 8.4 (OP's version) or the current version. Preferably to the `/interactive` branch. [Consider this](http://meta.stackexchange.com/questions/108714/best-way-to-reference-the-postgresql-manual). – Erwin Brandstetter Dec 30 '11 at 16:57
  • @ErwinBrandstetter: Nice link, I wasn't aware of "/current" documentation sub-paths. They're usually not the ones found on google... :) btw: You can edit answers too, if you feel something should be improved (e.g. a link) – Lukas Eder Dec 30 '11 at 17:24
  • @LukasEder: When dealing with experienced users like you I only edit when I am 100% certain and rather default to a comment. The high reputation tells me you generally know what you are doing. – Erwin Brandstetter Dec 30 '11 at 17:31
0

Can you check if this is working in postgres?

update 
  (select 
     a, b, c, genid() as val
  from mytable
  where package_id = 10
  )
set a = first_part(val),
       b = second_part(val),
       c = third_path(val);

Update: just for the ideea: Can you try it with a cursor?

DECLARE c1 CURSOR FOR 
      select 
         a, b, c, genid() as val
      from mytable
      where package_id = 10; 
...
UPDATE table SET ... WHERE CURRENT OF c1;
Florin Ghita
  • 17,525
  • 6
  • 57
  • 76
0

I would suggest to use following -

DECLARE @genID VARCHAR(100) -- or appropriate datatype

SET @genID = genid() -- Or select @genID = genid()    
update mytable    
   set a = first_part(@genID),
       b = second_part(@genID),
       c = third_path(@genID)
where package_id = 10; 
Alok
  • 266
  • 2
  • 14
  • That's probably the path of least resistance, compared to my rather verbose solution... (if pgplsql is allowed in this case) – Lukas Eder Dec 30 '11 at 08:36
  • In this case `genid()` will be called only once for the whole table update, am I right? I need it to be called once for each row. – Cezariusz Dec 30 '11 at 09:53
  • Yes that will be called only once for whole table update. And in your case I think User Lukas has suggested most appropriate way to achieve this. – Alok Dec 30 '11 at 10:32
  • 2
    That won't work with PostgreSQL (you cannot declare variables like that) –  Dec 30 '11 at 13:37