0

We have multiple schemas, I would like to run a simple count query across schemas such as:

SELECT COUNT(col_x) FROM schema1.table WHENRE col_x IS NOT NULL

I saw that I'm able to get all the schemas with:

SELECT schema_name FROM information_schema.schemata

So by using:

set search_path to schema1; 
SELECT COUNT(col_x)
FROM table
WHERE col_x is not NULL;

I was able to run the query for schema1

The question is - is it possible to run in a loop and use the schema name as a parameter for search_path and run the query across all schemas? or any other efficient way to do so?

USer22999299
  • 5,284
  • 9
  • 46
  • 78
  • https://stackoverflow.com/questions/2596670/how-do-you-find-the-row-count-for-all-your-tables-in-postgres –  Oct 18 '20 at 14:41

1 Answers1

4

You will need some plpgsql and dynamic SQL for this. Here is an anonymous block for illustration:

do language plpgsql
$$
declare
 v_schema_name text;
 table_row_count bigint;
 sysSchema text[] := array['pg_toast','pg_temp_1','pg_toast_temp_1','pg_catalog','public','information_schema'];
 -- other declarations here
begin
 for v_schema_name in SELECT schema_name FROM information_schema.schemata WHERE (schema_name != ALL(sysSchema)) loop
   begin
     execute format('select count(col_x) from %I.t_table', v_schema_name)
     into table_row_count;
     raise notice 'Schema % count %', v_schema_name, table_row_count;
   exception when others then null;  -- t_table may not exists in some schemata
   end;
 -- other statements here
 end loop;
end;
$$;

And btw WHERE col_x is not NULL is redundant.

user
  • 5,335
  • 7
  • 47
  • 63
Stefanov.sm
  • 11,215
  • 2
  • 21
  • 21
  • Seems like the query workd, but raise notice 'Schema % count %', schema_name, table_row_count; was not printing anything wastrying to run change it to raise instead by same thing how can i validate the output? – USer22999299 Oct 18 '20 at 16:06
  • Pls. comment the `exception` line in order to see what's wrong. I edited my answer, format fits better than replace. – Stefanov.sm Oct 18 '20 at 16:11
  • Right - ERROR: column reference "schema_name" is ambiguous LINE 1: SELECT schema_name FROM information_schema.schemata – USer22999299 Oct 18 '20 at 16:17
  • My mistake. The "schema_name" variable and field had the same name. Edited my answer, changed the variable to "v_schema_name", should be OK now. – Stefanov.sm Oct 18 '20 at 16:22
  • I see a major problem. The exception handler. As written this is a a pure bug. If a an error occurs your routine terminates prematurely. Meaning you get the wrong answer and have **no way to find out why**. At a minimum log/print the error. Even better if you are not prepared to write corrective code then do not capture the exception. – Belayer Oct 18 '20 at 16:31
  • @Belayer you are right. This is an illustration only, but as you noticed a very wrong one. Yet another edit. The `exception` statement may be developed in order to handle the case correctly. – Stefanov.sm Oct 18 '20 at 16:33
  • Adding filtering to the sys schemas - working as charm now :) thanks! BTW - edit your answer – USer22999299 Oct 18 '20 at 16:37