0

This code:

ALTER TABLE myschema.mytable add column geom geometry (point,4326);
CREATE INDEX mytable_idx on myschema.mytable using GIST(geom);
UPDATE myschema.mytable set geom = st_setsrid(st_point(mytable.long, mytable.lat), 4326);

This works fine when updating a single table. How would you convert it into a dynamic SQL function, with schema and table as parameters?

Encomium
  • 257
  • 4
  • 14
  • 1
    You're looking for dynamic SQL - `ALTER TABLE sch.tab` tries to alter the `"tab"` table in the schema `"sch"`, not resolving them to the parameters of the function. And your function expects two strings, so you'd need to call `SELECT create_geom('myschema', 'mytable');` – Bergi Jun 04 '21 at 16:18
  • Ah. Want to write that up so I can accept your answer? :) – Encomium Jun 04 '21 at 16:23
  • 1
    I don't know how to properly write dynamic SQL. I'd probably try making the function accept a [`regclass`](https://www.postgresql.org/docs/current/datatype-oid.html) though, not two strings. – Bergi Jun 04 '21 at 16:29
  • Thanks for the help; got it to work :) – Encomium Jun 04 '21 at 16:47

2 Answers2

2

Since the function input must be an existing table, the simplest safe way would be to use a regclass input parameter like demonstrated here:

However, you also need the bare table name for the concatenated index name, so I'll stick with taking text for schema and table separately:

CREATE OR REPLACE FUNCTION create_geom(_sch text, _tab text)
  RETURNS void
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE format(   
'ALTER TABLE %1$I.%2$I ADD COLUMN geom geometry(POINT,4326);
UPDATE %1$I.%2$I SET geom = st_setsrid(st_point(long, lat), 4326);
CREATE INDEX %3$I ON %1$I.%2$I USING gist(geom);'
   , _sch, _tab
   , _tab || '_geom_gist_idx');
END
$func$;

Call:

SELECT create_geom('myschema', 'mytable');

Use a single EXECUTE, no need for multiple calls.

Just omit table-qualification for columns in the UPDATE. While not joining additional tables, column names are unambiguous. Else, use a table alias, which can be constant. Like:

UPDATE %1$s AS x SET geom = st_setsrid(st_point(x.long, x.lat), 4326);

But it's smarter to populate the column before you build the index. That's a lot faster and produces a balanced index without bloat. So I switched the commands.

Note how I concatenate the index name first (_tab || '_geom_gist_idx'), and then double-quote as required with %3$I. That's the safe way. Something like %I_idx fails with non-standard names.

That said, it's typically a mistake to add columns with redundant information to a table. (What keeps you from changing one or the other? Why bloat the table?) Either just use an expression index instead of all of the above:

CREATE INDEX ON myschema.mytable USING gist (st_setsrid(st_point(long, lat), 4326));

Or drop the now redundant long & lat from the table. Those can be extracted from the new geom cheaply on the fly.

Or, if you need all columns (for special performance reasons?), consider a generated column instead. See:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

Having your queries as SQL templates and using format function for identifiers:

CREATE OR REPLACE FUNCTION public.create_geom(sch text, tab text)
RETURNS void language plpgsql AS $body$
DECLARE
 DYNSQLA constant text := 'ALTER TABLE %I.%I add column geom geometry (point,4326)';
 DYNSQLB constant text := 'CREATE INDEX %I_idx on %I.%I using GIST(geom);';
 DYNSQLC constant text := 'UPDATE %I.%I set geom = st_setsrid(st_point(%I.long, %I.lat), 4326)';
BEGIN
    execute format(DYNSQLA, sch, tab);
    execute format(DYNSQLB, tab, sch, tab);
    execute format(DYNSQLC, sch, tab, tab, tab);
END;
$body$;

SELECT create_geom('myschema','mytable');
Stefanov.sm
  • 11,215
  • 2
  • 21
  • 21
  • `%I_idx` could to produce wrong result; there is positional parameters like `%2I` to avoid something like `... tab, tab, tab ...` – Abelisto Jun 04 '21 at 19:26