3

I wanted to make an update function for a schema in PG SQL database. A test function is below. It doesn't work because it should never raise a notice, but will do that when running test_schema_update('second').

CREATE OR REPLACE FUNCTION test_schema_update(my_schema_name VARCHAR(200)) 
RETURNS void AS
$__marker__$
DECLARE
    actualValue varchar(1000);
    testValue varchar(1000);
BEGIN
    EXECUTE 'SET search_path TO ' || quote_ident(my_schema_name);

    testValue := (SELECT max(value) FROM setting WHERE settingkey = 'libraryname');
    EXECUTE ('SELECT max(value) FROM setting WHERE settingkey = ''libraryname''')
        INTO actualValue;

    IF (actualValue != testValue)
    THEN
        RAISE NOTICE '% != %', actualValue, testValue;
        RAISE INFO 'Schema was: %', current_schema();
    END IF;

    RESET search_path;
END;
$__marker__$ LANGUAGE plpgsql;

test_schema_update('first');
test_schema_update('second');

The problem is that PG SQL seem to analyze SELECT statements only once per session and then tables are fixed to specific schema. What is interesting is that you will get Schema was: second.

So is there a way reset SELECT statement analysis or some other way to work around this?

Side note: All schema creation function (ALTER TABLE, CREATE TABLE...) work fine. Only data manipulation functions seem to be affected (SELECT, INSERT, UPDATE).

Workaround

Before:

IF (
    SELECT max(id) FROM dimtime
)
THEN
    INSERT INTO dimtime SELECT * FROM public.src_dimtime;
END IF;

After:

EXECUTE ('
    SELECT max(id) FROM dimtime
')
INTO testInt;
IF (testInt IS NULL)
THEN
    EXECUTE 'INSERT INTO dimtime SELECT * FROM public.src_dimtime';
END IF;

Edit: The problem occurs in PostgreSQL 9.2, but seem not to occur in 9.3. Maybe it was fixed?

Nux
  • 9,276
  • 5
  • 59
  • 72
  • That syntax only allows a static schema to be set (not from a parameter). – Nux Sep 04 '14 at 14:52
  • 1
    This should work; dump please `s1 := current_schema()` vs. `execute 'select current_schema()' into s2` & tag the specific postgres version -- don't use `RESET` (that won't re-set the previous configuration, the default will be restored), use `SET LOCAL` instead at start, so you can leave `RESET`. – pozs Sep 04 '14 at 15:15

1 Answers1

5

That behaviour is to be expected. The reason is that PL/pgSQL uses plan caching for SQL statements, internally using standard prepared statements.

Per documentation:

As each expression and SQL command is first executed in the function, the PL/pgSQL interpreter creates a prepared execution plan (using the SPI manager's SPI_prepare and SPI_saveplan functions). Subsequent visits to that expression or command reuse the prepared plan.

This is also the reason, why plpgsql functions are often faster than plain SQL functions for complex operations:

Prepared statements are saved for the lifetime of the session, not just the transaction (but invalidated when underlying objects change, this is safe with concurrent access). The documentation once again:

Once PL/pgSQL has made an execution plan for a particular command in a function, it will reuse that plan for the life of the database connection. This is usually a win for performance, but it can cause some problems if you dynamically alter your database schema.

Bold emphasis mine.

If you want to "change" the schema of a table name, you are really going to refer to a completely different table and need to use dynamic SQL with EXECUTE, which generates a new plan every time (with all advantages and disadvantages):

Because PL/pgSQL saves execution plans in this way, SQL commands that appear directly in a PL/pgSQL function must refer to the same tables and columns on every execution; that is, you cannot use a parameter as the name of a table or column in an SQL command. To get around this restriction, you can construct dynamic commands using the PL/pgSQL EXECUTE statement — at the price of constructing a new execution plan on every execution.

Read the referenced chapter in the manual. It's quite comprehensive.

Code example

You don't need dynamic SQL for your added code example, and a single statement will be faster:

INSERT INTO dimtime  -- you may want list columns
SELECT *             -- here as well
FROM   public.src_dimtime
WHERE  NOT EXISTS (SELECT 1 FROM dimtime);
Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thanks. I've manged to workaround this restrictions, but EXECUTE on everything is quite painful (added example in my question). – Nux Sep 05 '14 at 14:13
  • @Nux: You don't need `EXECUTE` on *everything*. Only the statements where tables / columns are changed dynamically. The example in your question can be plain SQL - and simplified. I added a bit to my answer. – Erwin Brandstetter Sep 05 '14 at 15:20
  • I'm making update of many schemas at a time. In other words there is `client1.dimtime`, `client2.dimtime` and so on. My goal was to do as much as possible in SQL. In other function I get schema names from `INFORMATION_SCHEMA.schemata`. So the `INSERT` must be in `schema_update` function and must be called within the context of a specific schema. – Nux Sep 06 '14 at 22:18
  • 1
    @Nux: Yes, you need dynamic SQL in that case. But use my simplified query. One query is substantially cheaper than two. – Erwin Brandstetter Sep 07 '14 at 00:19