21

I am writing a program that syncs PostgreSQL and MS SQL Server databases (and applies some changes in this transition). With multi-million-record tables, it takes a long time and loads the server pretty badly with `SELECT *`; it also wastes resources parsing unchanged records and validating them against MS SQL Server.

Are there any logs in PostgreSQL that I can parse to find out the changes that took place in the last n minutes? That would let me select only the records I need to work with, improving performance.

Andrew

2 Answers

40

PostgreSQL: find changes in the last n minutes:

PostgreSQL does not automatically store the date or time that rows were added, updated, or deleted (tracking a timestamp on every row change would really slow things down for everyone who doesn't need it).

You'll have to do it yourself: add a timestamp column to the table, set it to `current_timestamp` whenever you insert (or update) a row, and then filter on it. When selecting, use a `WHERE` clause that keeps only the rows whose timestamp is newer than N minutes ago, as follows:

Get rows where a timestamp is greater than a date:

SELECT * from yourtable 
WHERE your_timestamp_field > to_date('05 Dec 2000', 'DD Mon YYYY');

Get rows that have been changed in the last n minutes:

SELECT * from yourtable 
WHERE your_timestamp_field > current_timestamp - interval '5 minutes';

Walkthrough example

drop table foo; 
CREATE TABLE foo( 
    msg character varying(20), 
    created_date date, 
    edited_date timestamp 
); 
insert into foo values( 'splog adfarm coins', '2015-01-01', current_timestamp);
insert into foo values( 'execute order 2/3', '2020-03-15', current_timestamp);
insert into foo values( 'deploy wessels', '2038-03-15', current_timestamp);
 
select * from foo where created_date < to_date('2020-05-05', 'YYYY-MM-DD'); 
    ┌────────────────────┬──────────────┬────────────────────────────┐ 
    │        msg         │ created_date │        edited_date         │ 
    ├────────────────────┼──────────────┼────────────────────────────┤ 
    │ splog adfarm coins │ 2015-01-01   │ 2020-12-29 11:46:27.968162 │ 
    │ execute order 2/3  │ 2020-03-15   │ 2020-12-29 11:46:27.96918  │ 
    └────────────────────┴──────────────┴────────────────────────────┘ 
 
select * from foo where edited_date > to_timestamp(
    '2020-12-29 11:42:37.719412', 'YYYY-MM-DD HH24:MI:SS.US'); 
    ┌────────────────────┬──────────────┬────────────────────────────┐ 
    │        msg         │ created_date │        edited_date         │ 
    ├────────────────────┼──────────────┼────────────────────────────┤ 
    │ execute order 2/3  │ 2020-03-15   │ 2020-12-29 11:46:27.96918  │ 
    │ deploy wessels     │ 2038-03-15   │ 2020-12-29 11:46:27.969988 │ 
    └────────────────────┴──────────────┴────────────────────────────┘ 
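And to answer the question directly, the same walkthrough table can be filtered to just the rows touched in the last n minutes (here 5, adjust the interval to taste):

```sql
-- Rows whose edited_date falls within the last 5 minutes:
select * from foo
where edited_date > current_timestamp - interval '5 minutes';
```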
Eric Leschinski
  • OK, inserts can be taken care of with timestamps; what about edits? – Andrew Dec 07 '12 at 17:52
  • When you edit the table, you could either update that same timestamp, or make an additional column called "date_updated" and set it to the current time on every update. Watch out for timezone changes, daylight saving time, and other shenanigans with the time. – Eric Leschinski Dec 07 '12 at 17:54
  • Use a **`trigger`** to update your_timestamp_field automatically whenever an INSERT/UPDATE runs. Then you no longer need to remember to set your_timestamp in every query. – Evgeny Nozdrev Oct 16 '18 at 12:20
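The trigger suggested in the last comment could look something like this, a minimal sketch using the walkthrough's `foo` table and `edited_date` column (names are illustrative):

```sql
-- Default covers inserts; the trigger covers updates too.
ALTER TABLE foo ALTER COLUMN edited_date SET DEFAULT current_timestamp;

CREATE OR REPLACE FUNCTION set_edited_date() RETURNS trigger AS $$
BEGIN
    -- Stamp the row being written with the current time.
    NEW.edited_date := current_timestamp;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER foo_set_edited_date
    BEFORE INSERT OR UPDATE ON foo
    FOR EACH ROW EXECUTE PROCEDURE set_edited_date();
```

With this in place, every `INSERT` or `UPDATE` refreshes `edited_date` automatically, so the "last n minutes" query above just works without touching the application code.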
4

You can use a trigger based approach described here:

http://wiki.postgresql.org/wiki/Audit_trigger

Essentially every table change fires a trigger which can write some info in a log table.
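A stripped-down version of that idea (much simpler than the wiki's full audit trigger; the table and function names here are just placeholders) might look like:

```sql
-- One row per change, recording which table, what happened, and when.
CREATE TABLE audit_log (
    table_name text      NOT NULL,
    action     text      NOT NULL,  -- 'INSERT', 'UPDATE', or 'DELETE'
    changed_at timestamp NOT NULL DEFAULT current_timestamp
);

CREATE OR REPLACE FUNCTION log_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit_log(table_name, action)
    VALUES (TG_TABLE_NAME, TG_OP);
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

-- Attach to any table you want to track:
CREATE TRIGGER foo_audit
    AFTER INSERT OR UPDATE OR DELETE ON foo
    FOR EACH ROW EXECUTE PROCEDURE log_change();
```

A sync job can then query `audit_log` for entries with `changed_at > current_timestamp - interval '5 minutes'` instead of scanning the whole source table.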

sega_sai