2

I am using a trigger to log all the UPDATE and INSERT OPERATION, but my table has many columns, and I would like to log which column did I updated or inserted and their new and old values. The following is a demonstration of what I am trying to achieve:

big table                   
-------------------------   
id       |feild_1   | field_2| field_3 .....          
-----------------------------------------------     
usr1     |old_value1|   
usr2     |old_value2|  

suppose I change value in usr1 filed_1 to 'new_value1'. I would like to see in my change history :

change history
------------------------- 
id       |field_changed  | old_value | new_value |        
----------------------------------------------------------- 
usr1     |field_1        |old_value_1|new_value_1|  

The difficulty lies in achieving the name of the column I modified. The current trigger function I used is modified from https://stackoverflow.com/a/49872254/5749562. It looks like the following:

CREATE FUNCTION change_trigger() RETURNS trigger AS $$
       BEGIN
         IF TG_OP = 'INSERT'
         THEN INSERT INTO change_history (
                id, old_value, new_value
              ) VALUES (
                OLD.id, row_to_json(OLD), row_to_json(NEW)
              );
           RETURN NEW;
         ELSIF  TG_OP = 'UPDATE'
         THEN
           INSERT INTO change_history (
             id, old_value, new_value
           )
           VALUES (OLD.id, row_to_json(OLD), row_to_json(NEW));
           RETURN NEW;
         END IF;
       END;
$$ LANGUAGE 'plpgsql' SECURITY DEFINER;

As you can see all the columns are recorded during the change, which is not so informative for big table. How I can modify my trigger function?

Jason Zhou
  • 25
  • 3
  • You can use jsonb to calculate the "difference", see here: http://8kb.co.uk/blog/2015/01/19/copying-pavel-stehules-simple-history-table-but-with-the-jsonb-type/ –  Jan 13 '20 at 06:50

2 Answers2

1

PostgreSQL always updates the whole row, and there is no way to determine in the trigger which columns were listed in the SET clause of the UPDATE statement.

The best you can do is to figure out which columns get changed by the UPDATE, by testing

IF OLD.col IS DISTINCT FROM NEW.col
Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • 3
    I would rather suggest ```IF OLD.col IS DISTINCT FROM NEW.col```. For a list of the columns you can query ```pg_attribute``` (https://www.postgresql.org/docs/current/catalog-pg-attribute.html) – Islingre Jan 12 '20 at 22:51
  • Does this mean that I will need to iterate through all the columns and get the column where NEW and OLD have distinct values in the trigger function? If so, what would the code be like for the column iteration and column name retrieval? Any example will be appreciated. – Jason Zhou Jan 12 '20 at 23:13
  • You can use `information_schema.columns` for introspection. [This answer](https://stackoverflow.com/a/38715122/6464308) shows how to use dynamic SQL to extract a column from `NEW` using dynamic SQL. – Laurenz Albe Jan 13 '20 at 04:21
  • @JasonZhou Iterating (in pl/pgsql) is possible but can be tricky. See my answer using e-maj: the trick is to convert OLD tuple and NEW tuple to json then hstore, then do hstoreOLD - hstoreNEW - metadata_fields. – mayeulk Aug 13 '23 at 00:25
0

Below, I propose a solution, built with the e-maj software (a free, open source PostgreSQL extension), which helps here. It is a set of triggers which log all changes made to a database (or a subset of it), and store this in a separate schema in very detailed audit logs, on which you can make your SELECTs. What is logged is detailed at: https://emaj.readthedocs.io/en/latest/architecture.html#logged-sql-statements

E-Maj is available via the PostgreSQL Extension network (https://pgxn.org/dist/e-maj/).

If you log "myschema.mytable", e-maj will create a table "emaj_myschema.mytable_log". Each updates will be logged as two rows (old values, new values, ID and time of change...), as follows:

select * from emaj_myschema.mytable_log;

E-maj log table

But this is only the beginning. From this, you can do:

CREATE EXTENSION IF NOT EXISTS hstore;
-- This first json function was created by klin, https://stackoverflow.com/questions/34746807/migrate-json-to-hstore
create or replace function simple_jsonb_to_hstore(jdata jsonb)
returns hstore language sql immutable
as $$
    select hstore(array_agg(key), array_agg(value))
    from jsonb_each_text(jdata)
$$;

A query getting something close to what you wanted:

with newr AS (
select  emaj_gid, emaj_tuple, emaj_verb,
  simple_jsonb_to_hstore(to_jsonb(t1)
  ) rec_hstore
from (  select * from   emaj_myschema.mytable_log) t1
WHERE emaj_verb='UPD' AND emaj_tuple='NEW')
select emaj_changed, emaj_txid, emaj_user,
simple_jsonb_to_hstore(to_jsonb(oldr) )
  - newr.rec_hstore
  - ARRAY['emaj_tuple','emaj_changed'] as oldval,
newr.rec_hstore
  - simple_jsonb_to_hstore(to_jsonb(oldr) )
  - ARRAY['emaj_tuple','emaj_changed'] as newval
FROM (  select * from   emaj_myschema.mytable_log) oldr
JOIN newr ON oldr.emaj_gid = newr.emaj_gid
WHERE oldr.emaj_verb='UPD' AND oldr.emaj_tuple='OLD'
;

Result such as below: result of query on e-maj logs

If you frequently do this, create a function:

DROP FUNCTION IF EXISTS emaj.logs_of_changed_fields;
CREATE FUNCTION emaj.logs_of_changed_fields(schema_name text, table_name text)
 RETURNS TABLE (
     emaj_changed    timestamptz,
     emaj_txid       bigint,
     emaj_user       varchar,
     oldval          hstore, -- only the fields that changed, old values
     newval          hstore,-- only the fields that changed, new values
     newrecord       hstore -- new values for all fields (including unchanged ones)
 )
 AS $$
BEGIN
RETURN QUERY EXECUTE '
with newr AS (
select  emaj_gid, emaj_tuple, emaj_verb,
  simple_jsonb_to_hstore(to_jsonb(t1)
  ) rec_hstore
from (  select * from emaj_' || schema_name || '.' || table_name || '_log ) t1
WHERE emaj_verb=''UPD'' AND emaj_tuple=''NEW''
)
select emaj_changed, emaj_txid, emaj_user,
simple_jsonb_to_hstore(to_jsonb(oldr) )
  - newr.rec_hstore
  - ARRAY[''emaj_tuple'',''emaj_changed''] as oldval,
newr.rec_hstore
  - simple_jsonb_to_hstore(to_jsonb(oldr) )
  - ARRAY[''emaj_tuple'', ''emaj_changed''] as newval,
  newr.rec_hstore as newrecord
FROM (  select * from emaj_' || schema_name || '.' || table_name || '_log ) oldr
JOIN newr ON oldr.emaj_gid = newr.emaj_gid
WHERE oldr.emaj_verb=''UPD'' AND oldr.emaj_tuple=''OLD''
';
END;
$$ LANGUAGE PLPGSQL;

This will show you all updated fields, including when several fields were updated at the same time. If you want to further analyze updates to a single field, do:

select oldval -> 'field_1' AS old_value,
  newval  ->  'field_1' as new_value,
  * from emaj.logs_of_changed_fields('exo_public','t_acteur');

Further, the result from the SQL below has structure which is precisely that of your example:

select newrecord -> 'user_field' as user_field, (old).key,
   (old).value as old_value, (new).value as new_value
from (
  select newrecord,
    each(oldval) as old
         ,each(newval) as new
    
  from emaj.logs_of_changed_fields('exo_public','t_acteur')
) t;

Result as below (the 'key' field is the key of the hstore, that is your 'field_changed'): enter image description here

mayeulk
  • 100
  • 8