15

I'm currently expecting for somebody to advice me on the process which I'm gonna take forward for DB archiving.

I've database (DB-1) which has 2 very large tables, one table having 25 GB of data and another is 20 GB of data. Which cause major performance issues even I have indexes.

So, we considered to archive the old data with the below process,

  1. Clone a new database (DB-2) from existing database (DB-1).
  2. Delete the old data from DB-1, so it will have only the last 2 years records. In case If I need old data can connect DB-2.
  3. Every month should move an old data from DB-1 to DB-2, and delete the moved rows from DB-1.
King_Fisher
  • 1,171
  • 8
  • 21
  • 51

2 Answers2

22

That is the wrong approach.

What you are looking for is partitioning.

You can create range partitions covering one year each. To remove old data all you need to do is to drop the partition for the year(s) no longer needed.

If you need to keep the data for some reasons, you can also just detach the partition from the table. Then the data is still "lying around", but would not show up in the (partitioned) table. You could query the (detached) partition directly to access that data. You could even move that (detached) partition to a slower harddisk to free up space on your fast disks if you have more than one.

But you might even see that partitioning alone might already improves performance, but that depends a lot on your queries.

Note that you should use Postgres 11 for that, as partitioning wasn't that sophisticated in older versions.

  • Thanks , I have EnterpriseDB 9.5.17.23 , I was thinking about partitioning, but If I partition the table, I cant merge it back if I did wrongly or I needed ? and thought will cause issues when run reports for 5 -10 years of old data ? – King_Fisher Jul 11 '19 at 12:40
  • @King_Fisher: partitioning **is** the way to go here (but **not** with Postgres 95.). And you can re-attach a partition if you want (again, only if you upgrade to Postgres 11 - and given your mentioning of performance problems that alone might already solve some of them) –  Jul 11 '19 at 12:42
  • Thanks, Do I need a DBA assistance for doing a partition ? – King_Fisher Jul 11 '19 at 12:59
  • we don't have plan yet to upgrade the postgresql. So, I have no choice, can you advice please for an alternate way in this case ? – King_Fisher Jul 16 '19 at 15:19
  • Is there any simpler way to do partitioning with RDS postgresql v11.1? – A l w a y s S u n n y Feb 11 '20 at 12:47
  • 2
    What if table A has foreign key dependency on table B but rows created at different years? – valpa Jul 01 '20 at 07:40
  • https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS Check out this too before rush for it. – Siraj Alam Jan 15 '22 at 16:10
  • The old data in this table are only occasionally accessible to certain reports that users want, so I don’t want to pay for that unused data so I want to data that needs to be persisted indefinitely somewhere. – Giridharan Jun 15 '22 at 15:38
2

While you should no doubt upgrade your current version (I'd suggest moving away from the EDB system you are working on now, and going to community based Postgres 11) even if you can't upgrade, partitioning is still a much better answer than creating a second database.

By recreating your table as a set of partitions within the same database, you will be able to add/remove data in a much cleaner fashion, and it will make dealing with Vacuums much easier. Even in 9.5, you can take advantage of table inheritance to build out partitions by first adding partitions for incoming data, and then creating partitions at various intervals (probably monthly, since you want to run monthly cleanup) and moving the data into those partitions. This can be accomplished atomically with a series of INSERT INTO partition SELECT * FROM table WHERE <timestamp> style statements.

I suspect you can probably manage this yourself (you need basic sql and the ability to write simple triggers/functions... here is a link to the 9.5 docs), but if you need help, you can engage with one of the Postgres chat communities, or contact a support company if you want a deeper dive.

xzilla
  • 1,142
  • 9
  • 19