0

I want to UPDATE several rows in the table myTable (335379 rows).

CREATE OR REPLACE FUNCTION costs_f(
someFloat float) RETURNS void AS
$$
BEGIN
UPDATE ways 
SET cost_time = CASE WHEN $1 = -1.0 THEN -1 ELSE anotherFloat * $1 END
FROM (SELECT w.gid AS id,
mc.name,
w.someCosts
FROM myTable mt
JOIN myClasses mc
ON mt.class_id = mc.class_id) AS tempTable
WHERE gid = id AND tempTable.name_name = $2;
END
$$
language 'plpgsql';

This function is then called in another function

CREATE OR REPLACE FUNCTION update_costs_f(
someFloat float) RETURNS void AS
$$
DECLARE
someArr varchar[] := ARRAY['a', 'b', 'c', 'd',
    'e', 'f', 'g', 'h', 'i', 
    'j', 'k', 'l',
    'm', 'n', 'o', 'p', 
    'q', 'r', 's', 't', 'ul', 'v'];
i text;
BEGIN
FOREACH i IN ARRAY someArr LOOP
    PERFORM costs_f($1, i);
END LOOP;
END
$$
language 'plpgsql';

Then I do

SELECT update_costs_f(10.0);

However, that takes very very long! Is there a way to improve speed?

*Note: This code is abstracted. In the second function there are more ForEach loop. I have several arrays.

four-eyes
  • 10,740
  • 29
  • 111
  • 220
  • 2
    You obfuscated your code to an extent that it's no longer valid. You are passing a character string to `costs_f()` inside the loop, but the function expects a float value. The function also uses a variable `anotherfloat` that is not declared anywhere. And, unrelated but: comparing a float for equality can have many surprising results: http://floating-point-gui.de/ –  Jan 18 '17 at 09:49
  • It's still invalid - but the example given can be reduced to a single update statement and no function needed. So your abstraction is not helping. –  Jan 18 '17 at 09:53
  • @a_horse_with_no_name of course I can write it into one statement. But these function get called at different spots from different parts of the program. So I need them to be in two different functions. What I am interested in is, why the update takes so long when I execute it the way I execute the function. – four-eyes Jan 18 '17 at 09:56
  • 3
    Because you are doing a slow row-by-row update. A single statement updating many rows is always faster then many statements updating a single row. The only way to improve this, is to get rid of the slow, inefficient and non-scalable row-by-row processing –  Jan 18 '17 at 09:57
  • @a_horse_with_no_name ah, thanks. Could you give me a hint what to look for? – four-eyes Jan 18 '17 at 10:00

1 Answers1

1

1. Get rid of tempTable.name_name = $2;

As @a_horse_with_no_name said, you should not be running multiple update statements.

I see you are calling PERFORM costs_f($1, i) multiple times for each letter in someArr. Instead of calling it multiple times, call it once and use the in operator, e.g. :

AND tempTable.name_name in($2);

2. Split the Update into 2 Statements

You should split your update statement into two statements:

UPDATE ways 
SET cost_time = -1.0 FROM tempTable 
WHERE gid = id AND tempTable.name_name = $2 AND
$1 = -1.0 ;

UPDATE ways 
SET cost_time =anotherFloat * $1 FROM tempTable 
WHERE gid = id AND tempTable.name_name = $2 AND
NOT ($1 = -1.0) ;

Note: Simplified by only using tempTable instead of the full nested subuery/alias

3. Move conditions into the nested Aliased From SubQuery

Right off the bat I can see you can move the condition name_name = $2 directly into the from aliased subquery:

FROM (SELECT w.gid AS id,
mc.name,
w.someCosts
FROM myTable mt
JOIN myClasses mc
ON mt.class_id = mc.class_id 
WHERE name_name = $2 ) AS tempTable
WHERE gid = id;

All Together

CREATE OR REPLACE FUNCTION update_costs_f(
someFloat float) RETURNS void AS
$$
DECLARE
someArr varchar[] := ARRAY['a', 'b', 'c', 'd',
    'e', 'f', 'g', 'h', 'i', 
    'j', 'k', 'l',
    'm', 'n', 'o', 'p', 
    'q', 'r', 's', 't', 'ul', 'v'];
i text;
BEGIN
PERFORM costs_f($1, someArr );
END
$$
language 'plpgsql';


CREATE OR REPLACE FUNCTION costs_f(
someFloat float) RETURNS void AS
$$
BEGIN
UPDATE ways 
SET cost_time = -1.0
FROM (SELECT w.gid AS id,
mc.name,
w.someCosts
FROM myTable mt
JOIN myClasses mc
ON mt.class_id = mc.class_id
WHERE $1 = -1.0 AND gid = id AND tempTable.name_name = ANY($2);
) AS tempTable;    

UPDATE ways 
SET cost_time =anotherFloat * $1 
FROM (SELECT w.gid AS id,
mc.name,
w.someCosts
FROM myTable mt
JOIN myClasses mc
ON mt.class_id = mc.class_id
WHERE NOT($1 = -1.0) AND gid = id AND tempTable.name_name =ANY($2);
) AS tempTable;

END
$$
language 'plpgsql';

Finally

You could use an IF statement in order to not run both SQL update statements...

See: PostgreSQL IF statement Example:

IF ($1 = -1.0) THEN
   ...
ELSE 
  ...
END IF;
Community
  • 1
  • 1
Menelaos
  • 23,508
  • 18
  • 90
  • 155
  • Thanks. Your are missing a function argument in `costs_f`. When calling it, (`PERFORM costs_f($1, someArr );` your passing `$1` and `someArr` but you are not defining it `CREATE OR REPLACE FUNCTION costs_f( someFloat float) RETURNS void AS` here. – four-eyes Jan 18 '17 at 13:53
  • @Stophface Yeh... but have you tried the change to check the speed? – Menelaos Jan 18 '17 at 14:08
  • I am trying to implement. Its mocking around with the array. I changed `CREATE OR REPLACE FUNCTION costs_f( someFloat float someText text[]) RETURNS void AS` and `SELECT PERFORM costs_f($1, someArr::text[]);` but its saying `ERROR: operator does not exist: text = text[] LINE 10: WHERE $1 = -1.0 AND gid = id AND tempTable.name_name in ($2) HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.` And the fancy `^` in the Error Message is pointing to `IN` – four-eyes Jan 18 '17 at 14:13
  • @Stophface http://www.dbrnd.com/2015/09/string-array-input-parameter-in-postgresql/ ? `ArrayText character varying[]` . Try also instead of IN, the ANY function. http://stackoverflow.com/q/30298031/1688441 ... `ANY(ArrayText)` – Menelaos Jan 18 '17 at 14:19
  • I tried that the first time. I changed to `CREATE OR REPLACE FUNCTION costs_f( someFloat float someText character varying[])` and the typecast as well `:: character varying[]`. Same error. Same results with `varchar[]`. I tried changing `IN` to `ANY` already. `ERROR: syntax error at or near "ANY" text = text[] LINE 10: WHERE $1 = -1.0 ANY gid = id AND` – four-eyes Jan 18 '17 at 14:23
  • hmmmm the problem seems to be `WHERE $1 = -1.0 AND gid = id AND tempTable.name_name in ($2)` .... I think you may need to cast $2 as an array. – Menelaos Jan 18 '17 at 14:26
  • Yep, for sure. But how? Shouldnt that be one of `::text[] ::varchar[]` or `::character varying[]`? – four-eyes Jan 18 '17 at 14:26
  • It has to be consistent.. so either varying everywhere or varchar. Try varying and change the definition of the someArr array as well. I think though you need to use the same in all places of both functions. – Menelaos Jan 18 '17 at 14:28
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/133461/discussion-between-stophface-and-maythesource-com). – four-eyes Jan 18 '17 at 14:31