Table name as a PostgreSQL function parameter

Question

I want to pass a table name as a parameter in a Postgres function. I tried this code:

CREATE OR REPLACE FUNCTION some_f(param character varying) RETURNS integer 
AS $$
    BEGIN
    IF EXISTS (select * from quote_ident($1) where quote_ident($1).id=1) THEN
     return 1;
    END IF;
    return 0;
    END;
$$ LANGUAGE plpgsql;

select some_f('table_name');

And I got this:

ERROR:  syntax error at or near "."
LINE 4: ...elect * from quote_ident($1) where quote_ident($1).id=1)...
                                                             ^

********** Error **********

ERROR: syntax error at or near "."

And here is the error I got when changed to this select * from quote_ident($1) tab where tab.id=1:

ERROR:  column tab.id does not exist
LINE 1: ...T EXISTS (select * from quote_ident($1) tab where tab.id...

Probably, quote_ident($1) works, because without the where quote_ident($1).id=1 part I get 1, which means something is selected. Why may the first quote_ident($1) work and the second one not at the same time? And how could this be solved?

I know this question is kind of old, but I found it while searching for the answer to another issue. Couldn't your function just query the informational_schema? I mean, that's kind of what it's for in a way - to let you query and see what objects exist in the database. Just an idea. — David S, Sep 25 '12 at 21:24

Erwin Brandstetter · Accepted Answer · 2022-10-17T21:54:51.313

Before you go there: for only few, known tables names, it's typically simpler to avoid dynamic SQL and spell out the few code variants in separate functions or in a CASE construct.

That said, what you are trying to achieve can be simplified and improved:

CREATE OR REPLACE FUNCTION some_f(_tbl regclass, OUT result integer)
    LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE format('SELECT (EXISTS (SELECT FROM %s WHERE id = 1))::int', _tbl)
   INTO result;
END
$func$;

Call with schema-qualified name (see below):

SELECT some_f('myschema.mytable');  -- would fail with quote_ident()

Or:

SELECT some_f('"my very uncommon table name"');

Major points

Use an OUT parameter to simplify the function. You can directly select the result of the dynamic SQL into it and be done. No need for additional variables and code.

EXISTS does exactly what you want. You get true if the row exists or false otherwise. There are various ways to do this, EXISTS is typically most efficient.

You seem to want an integer back, so I cast the boolean result from EXISTS to integer, which yields exactly what you had. I would return boolean instead.

I use the object identifier type regclass as input type for _tbl. That does everything quote_ident(_tbl) or format('%I', _tbl) would do, but better, because:

.. it prevents SQL injection just as well.
.. it fails immediately and more gracefully if the table name is invalid / does not exist / is invisible to the current user. (A regclass parameter is only applicable for existing tables.)
.. it works with schema-qualified table names, where a plain quote_ident(_tbl) or format(%I) would fail because they cannot resolve the ambiguity. You would have to pass and escape schema and table names separately.

It only works for existing tables, obviously.

I still use format(), because it simplifies the syntax (and to demonstrate how it's used), but with %s instead of %I. Typically, queries are more complex so format() helps more. For the simple example we could as well just concatenate:

EXECUTE 'SELECT (EXISTS (SELECT FROM ' || _tbl || ' WHERE id = 1))::int'

No need to table-qualify the id column while there is only a single table in the FROM list. No ambiguity possible in this example. (Dynamic) SQL commands inside EXECUTE have a separate scope, function variables or parameters are not visible there - as opposed to plain SQL commands in the function body.

Here's why you always escape user input for dynamic SQL properly:

db<>fiddle here demonstrating SQL injection
_{Old sqlfiddle}

@suhprano: Sure. Try it: `DO $$BEGIN EXECUTE 'ANALYZE mytbl'; END$$;` — Erwin Brandstetter, Feb 23 '15 at 20:58
@Lotus: The explanation is in the answer. `regclass` values are escaped automatically when output as text. `%L` would be *wrong* in this case. — Erwin Brandstetter, Jun 02 '17 at 14:49
`CREATE OR REPLACE FUNCTION table_rows(_tbl regclass, OUT result integer) AS $func$ BEGIN EXECUTE 'SELECT (SELECT count(1) FROM ' || _tbl || ' )::int' INTO result; END $func$ LANGUAGE plpgsql;` create a table row count function, `select table_rows('nf_part1');` — Ferris, May 22 '19 at 12:34
@Ashish: See last chapter of https://stackoverflow.com/a/11751557/939860 — Erwin Brandstetter, Sep 11 '21 at 21:55

ErikE · Answer 2 · 2020-03-24T16:55:32.450

If at all possible, don't do this.

That's the answer—it's an anti-pattern. If the client knows the table it wants data from, then SELECT FROM ThatTable. If a database is designed in a way that this is required, it seems to be designed sub-optimally. If a data access layer needs to know whether a value exists in a table, it is easy to compose SQL in that code, and pushing this code into the database is not good.

To me this seems like installing a device inside an elevator where one can type in the number of the desired floor. After the Go button is pressed, it moves a mechanical hand over to the correct button for the desired floor and presses it. This introduces many potential issues.

Please note: there is no intention of mockery, here. My silly elevator example was *the very best device I could imagine* for succinctly pointing out issues with this technique. It adds a useless layer of indirection, moving table name choice from a caller space (using a robust and well-understood DSL, SQL) into a hybrid using obscure/bizarre server-side SQL code.

Such responsibility-splitting through movement of query construction logic into dynamic SQL makes the code harder to understand. It violates a standard and reliable convention (how a SQL query chooses what to select) in the name of custom code fraught with potential for error.

Here are detailed points on some of the potential problems with this approach:

Dynamic SQL offers the possibility of SQL injection that is hard to recognize in the front end code or the back end code alone (one must inspect them together to see this).
Stored procedures and functions can access resources that the SP/function owner has rights to but the caller doesn't. As far as I understand, without special care, then by default when you use code that produces dynamic SQL and runs it, the database executes the dynamic SQL under the rights of the caller. This means you either won't be able to use privileged objects at all, or you have to open them up to all clients, increasing the surface area of potential attack to privileged data. Setting the SP/function at creation time to always run as a particular user (in SQL Server, EXECUTE AS) may solve that problem, but makes things more complicated. This exacerbates the risk of SQL injection mentioned in the previous point, by making the dynamic SQL a very enticing attack vector.
When a developer must understand what the application code is doing in order to modify it or fix a bug, he'll find it very difficult to get the exact SQL query being executed. SQL profiler can be used, but this takes special privileges and can have negative performance effects on production systems. The executed query can be logged by the SP but this increases complexity for questionable benefit (requiring accommodating new tables, purging old data, etc.) and is quite non-obvious. In fact, some applications are architected such that the developer does not have database credentials, so it becomes almost impossible for him to actually see the query being submitted.
When an error occurs, such as when you try to select a table that doesn't exist, you'll get a message along the lines of "invalid object name" from the database. That will happen exactly the same whether you're composing the SQL in the back end or the database, but the difference is, some poor developer who's trying to troubleshoot the system has to spelunk one level deeper into yet another cave below the one where the problem exists, to dig into the wonder-procedure that Does It All to try to figure out what the problem is. Logs won't show "Error in GetWidget", it will show "Error in OneProcedureToRuleThemAllRunner". This abstraction will generally make a system worse.

An example in pseudo-C# of switching table names based on a parameter:

string sql = $"SELECT * FROM {EscapeSqlIdentifier(tableName)};"
results = connection.Execute(sql);

While this does not eliminate every possible issue imaginable, the flaws I outlined with the other technique are absent from this example.

I don't completely agree with that. Say, you press this "Go" button and then some mechanism checks, if the floor exists. Functions may be used in triggers, which in turn can check some conditions. This desicion may not be the most beautiful, but if the system is big enough already and you need to make some corrections in its logic, well, this choice is not so dramatic, I suppose. — John Doe, Jul 11 '12 at 05:04
But consider that the action of trying to press a button that doesn't exist will simply generate an exception no matter how you handle it. You can't actually push a nonexistent button, so there is no benefit to adding, on top of button-pushing, a layer to check for nonexistent numbers, since such number entry didn't exist before you created said layer! Abstraction is in my opinion the most powerful tool in programming. However, adding a layer that merely poorly duplicates an existing abstraction is **wrong**. The database itself is *already* an abstraction layer that maps names to data sets. — ErikE, Dec 31 '13 at 21:02
Spot on. The whole point of SQL is to express the set of data that you want extracted. The only thing that this function does is to encapsulate a "canned" SQL statement. Given the fact that the identifier is also hard coded the whole thing has a bad smell to it. — Nick Hristov, May 19 '14 at 16:14
What about a package to guess the database structure, tablenames, etc, and at some moment it also wants to known the number of rows in the tables it finds. — arivero, Jan 17 '15 at 20:08
@arivero "a package"? What are we talking about here? Can't the package construct its query using the known table name? You're not proposing anything different than what I already addressed. — ErikE, Jan 27 '15 at 23:43
I have a database provided by a 3rd party application. I want to find all the tables that have been updated since a timestamp on a record in that database. I think I can do that with an EXECUTE, but obviously that's not the correct way. What is the correct way? — AixNPanes, Apr 29 '15 at 14:36
Please ask a new question. Comments aren't a good way to get answers. — ErikE, Apr 30 '15 at 16:28
This is the same logic that was used to scare everybody away from `goto`s in C code. It's generally good advice, but it's not always a bad thing. — three-cups, May 28 '15 at 21:38
@three Until someone is in the *mastery* phase (see [the Dreyfus model of skill acquisition](http://en.wikipedia.org/wiki/Dreyfus_model_of_skill_acquisition)) of a skill, he should simply absolutely obey rules like "do NOT pass table names into a procedure to be used in dynamic SQL". Even hinting that it's not always bad is is itself *bad advice*. Knowing this, the beginner will be tempted to use it! That's BAD. Only masters of a topic should be breaking the rules, as they're the only ones with the experience to know in any particular case whether such rule-breaking actually makes sense. — ErikE, May 29 '15 at 22:07
@ErikE You're right, you should know what you're doing before using a non-standard approach. I might focus more energy into explaining why it's a bad idea instead of mocking technique in question. — three-cups, Jun 01 '15 at 21:33
@three-cups I did update with a lot more detail on why it's a bad idea. — ErikE, Apr 08 '16 at 22:58
"*At least in SQL Server, dynamic SQL executes with the privileges of the caller, not the running code.*" - what do you mean with that? Typically any statement you run, runs with the privileges of the caller. And besides: in Postgres (and SQL Server) you can define under which "privileges" a function is executed: either the owner/creator of the function or the current user (=caller) — , Sep 30 '16 at 21:43
@a_horse_with_no_name Updated with a much better description for the privilege concerns I was thinking of. — ErikE, Sep 30 '16 at 21:54
Well, in Postgres dynamic SQL runs with the privileges that were defined for the function. You can't break out of that, just because of dynamic SQL. But I still don't get your concerns related to that. Executing the code with the privileges of the caller is what the caller will typically expect. And if you create a function that runs with elevated privileges (which is what "security definer" usually is used for) then you shouldn't be allowing the caller to pass the SQL to be executed. — , Sep 30 '16 at 22:07

Daniel Vérité · Answer 3 · 2012-05-22T23:56:35.580

10

Inside plpgsql code, The EXECUTE statement must be used for queries in which table names or columns come from variables. Also the IF EXISTS (<query>) construct is not allowed when query is dynamically generated.

Here's your function with both problems fixed:

CREATE OR REPLACE FUNCTION some_f(param character varying) RETURNS integer 
AS $$
DECLARE
 v int;
BEGIN
      EXECUTE 'select 1 FROM ' || quote_ident(param) || ' WHERE '
            || quote_ident(param) || '.id = 1' INTO v;
      IF v THEN return 1; ELSE return 0; END IF;
END;
$$ LANGUAGE plpgsql;

edited May 22 '12 at 23:56

answered May 22 '12 at 18:49

Daniel Vérité

58,074
15
129
156

Thank you, I was making just the same a couple of minutes ago when read your answer. The only difference is I had to remove `quote_ident()` because it added extra quotes, which surprised me a little, well, 'cause it is used in most examples. – John Doe May 22 '12 at 19:01
Those extra quotes will be needed if/when the table name contains characters outside [a-z], or if/when it clashes with a reserved identifier (example: "group" as a table name) – Daniel Vérité May 22 '12 at 19:09
And, by the way, could you please provide a link that would prove that `IF EXISTS ` construct doesn't exist? I'm pretty sure I saw something like that as a working code sample. – John Doe May 22 '12 at 21:44
1

@JohnDoe: `IF EXISTS () THEN ...` is a perfectly valid construct in plpgsql. Just not with dynamic SQL for ``. I use it a lot. Also, this function can be improved quite a bit. I posted an answer. – Erwin Brandstetter May 22 '12 at 23:10
1

Sorry, you're right about `if exists()`, it's valid in the general case. Just checked and modified the answer accordingly. – Daniel Vérité May 22 '12 at 23:59
@john 'SELECT 1' inside an EXISTS clause does not create a 1-column table. It's time to stop correcting professionals with rudimentary and unlearned reasoning. – ErikE Jul 11 '12 at 05:51
@ErikE Thanks for mentioning that (though rather emotionally). – John Doe Jul 11 '12 at 06:47

score 9 · Answer 4 · answered May 18 '20 at 18:47

I know this is an old thread, but I ran across it recently when trying to solve the same problem - in my case, for some fairly complex scripts.

Turning the entire script into dynamic SQL is not ideal. It's tedious and error-prone work, and you lose the ability to parameterize: parameters must be interpolated into constants in the SQL, with bad consequences for performance and security.

Here's a simple trick that lets you keep the SQL intact if you only need to select from your table - use dynamic SQL to create a temporary view:

CREATE OR REPLACE FUNCTION some_f(_tbl varchar) returns integer
AS $$
BEGIN
    drop view if exists myview;
    execute format('create temporary view myview as select * from %s', _tbl);
    -- now you can reference myview in the SQL
    IF EXISTS (select * from myview where myview.id=1) THEN
     return 1;
    END IF;
    return 0;
END;
$$ language plpgsql;

It's even an older thread now :). Just in case, "temporary" requires the schema to be temporary as well. You can omit that keyword and do the cleaning as needed. Apart from the orthodox discussions here, it's a useful technique at least for some admin tasks. — full.stack.ex, Jul 14 '21 at 20:12

score 4 · Answer 5 · answered May 22 '12 at 16:35

The first doesn't actually "work" in the sense that you mean, it works only in so far as it does not generate an error.

Try SELECT * FROM quote_ident('table_that_does_not_exist');, and you will see why your function returns 1: the select is returning a table with one column (named quote_ident) with one row (the variable $1 or in this particular case table_that_does_not_exist).

What you want to do will require dynamic SQL, which is actually the place that the quote_* functions are meant to be used.

Thanks a lot, Matt, `table_that_does_not_exist` gave the same result, you're right. — John Doe, May 22 '12 at 17:26

score 2 · Answer 6 · answered Dec 21 '17 at 14:24

If the question was to test if the table is empty or not (id=1), here is a simplified version of Erwin's stored proc :

CREATE OR REPLACE FUNCTION isEmpty(tableName text, OUT zeroIfEmpty integer) AS
$func$
BEGIN
EXECUTE format('SELECT COALESCE ((SELECT 1 FROM %s LIMIT 1),0)', tableName)
INTO zeroIfEmpty;
END
$func$ LANGUAGE plpgsql;

score 2 · Answer 7 · answered Aug 28 '18 at 10:35

If you want table name, column name and value to be dynamically passed to function as parameter

use this code

create or replace function total_rows(tbl_name text, column_name text, value int)
returns integer as $total$
declare
total integer;
begin
    EXECUTE format('select count(*) from %s WHERE %s = %s', tbl_name, column_name, value) INTO total;
    return total;
end;
$total$ language plpgsql;


postgres=# select total_rows('tbl_name','column_name',2); --2 is the value

dm3 · Answer 8 · 2015-05-26T22:55:05.997

-2

I have 9.4 version of PostgreSQL and I always use this code:

CREATE FUNCTION add_new_table(text) RETURNS void AS
$BODY$
begin
    execute
        'CREATE TABLE ' || $1 || '(
        item_1      type,
        item_2      type
        )';
end;
$BODY$
LANGUAGE plpgsql

And then:

SELECT add_new_table('my_table_name');

It works good for me.

Attention! Above example is one of those which shows "How do not if we want to keep safety during querying the database" :P

edited May 26 '15 at 22:55

answered Apr 30 '15 at 12:15

dm3

1
2

1

Creating a `new` table is different from operating with the name of an existing table. Either way, you should escape text parameters executed as code or you are open to SQL injection. – Erwin Brandstetter May 08 '15 at 20:35
Oh, yeah, my mistake. The topic misled me and in addition I didn't read it to the end. Normally in my case. :P Why code with a text parameter is exposed to injection? – dm3 May 08 '15 at 23:31
Oops, it's really dangerous. Thank you for the answer! – dm3 May 10 '15 at 16:33

Table name as a PostgreSQL function parameter

8 Answers8

Major points

Linked

Related