0

In my database, I have the standard app tables and backup tables. Eg. for a table "employee", I have a table called "bak_employee". The bak_employee table is a backup of the employee table. I use it to restore the employee table between tests.

I'd figure I can use these "bak_" tables to see the changes that have occurred during the test like this:

SELECT * FROM employee EXCEPT SELECT * FROM bak_employee

This will show me the inserted and updated records. I'll ignore the deleted records for now.

Now, what I would like to do is go through all the tables in my database to see if there's any changes in any of the tables. I was thinking of doing this as a function so it's easy to call over and over. This is what I have so far:

CREATE OR REPLACE FUNCTION public.show_diff()
  RETURNS SETOF diff_tables AS
$BODY$

DECLARE
  app_tables text;
BEGIN

    FOR app_tables IN
        SELECT table_name  

        FROM   information_schema.tables 

        WHERE  table_catalog = 'myDatabase' 
          AND  table_schema = 'public'
          AND  table_name not like 'bak_%'          -- exclude existing backup tables
    LOOP

        -- somehow loop through tables to see what's changed something like:
        EXECUTE 'SELECT * FROM ' || app_tables || ' EXCEPT SELECT * FROM bak_' || app_tables;

    END LOOP;

    RETURN;
END;
$BODY$
LANGUAGE plpgsql;

But obviously this isn't going to return me any useful information. Any help would be appreciated.

Steven
  • 714
  • 2
  • 8
  • 21
  • 1
    Unrelated, but: the `except` won't show you all differences. something like `select * from employee e full outer join bak_employee b on e.id = b.id where e is distinct from b` would be better. –  May 31 '16 at 17:18
  • This can't really be done with a "universal" function because the function can return only one "thing". But those tables will have different number of columns and a result set needs to have the same number of columns for **all** rows. What could work is to return each of the rows as a JSON document or a `hstore` that contains the modified columns as keys. –  May 31 '16 at 17:21

1 Answers1

3

You cannot return different well-known row types from the same function in the same call. A cheap fix is to cast each row type to text, so we have a common return type:

CREATE OR REPLACE FUNCTION public.show_diff()
  RETURNS SETOF text
  LANGUAGE plpgsql AS   -- text!!
$func$
DECLARE
   app_table text;
BEGIN
   FOR app_table IN
      SELECT table_name  
      FROM   information_schema.tables 
      WHERE  table_catalog = 'myDatabase' 
      AND    table_schema = 'public'
      AND    table_name NOT LIKE 'bak_%'   -- exclude existing backup tables
   LOOP
      RETURN NEXT ' ';
      RETURN NEXT '=== ' || app_table || ' ===';
      RETURN QUERY EXECUTE format(
        'SELECT x::text FROM (TABLE %I EXCEPT ALL TABLE %I) x'
       , app_table, 'bak_' || app_table);
   END LOOP;

   RETURN;
END
$func$;

Call:

SELECT * FROM public.show_diff();

I had the test suggested by @a_horse at first, but after your comment I realized that there is no need for this. EXCEPT considers NULL values to be equal and shows all differences.

While being at it, I improved and simplified your solution some more. Use EXCEPT ALL: cheaper and does not run the risk of folding complete duplicates.

TABLE is just syntactical sugar. See:

However, if you have an index on a unique (combination of) column(s), a JOIN like I suggested before should be faster: finding the only possible duplicate via index should be substantially cheaper.

Crucial element is the cast the row type to text (x::text).

You can even make the function work for any table - but never more than one at a time: With a polymorphic parameter type:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Awesome!!!! Love your solution. Except I changed the select statement a bit:SELECT x::text FROM (SELECT * FROM %I EXCEPT SELECT * FROM %I) as x – Steven Jun 10 '16 at 14:30
  • @Steven: You are right, `EXCEPT` should work just fine. Or better, yet: `EXCEPT ALL`. The join *may* be faster, though. Consider the update. – Erwin Brandstetter Jun 10 '16 at 16:16