
I'm going to store a large amount of data (logs) in partitioned PostgreSQL tables (one table per day). I would like to compress some of them to save space on my disks, but I don't want to lose the ability to query them in the usual manner.

Does PostgreSQL support such transparent compression, and where can I read about it in more detail? I think there should be some well-known magic name for such a feature.

Erwin Brandstetter
Prikrutil
  • Not specifically what you asked for, but possibly helpful with large data sets: PostgreSQL supports table inheritance, which lets you partition your data, for example by date ranges (or some other logic). – Eric Sep 14 '09 at 16:39
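As a sketch of that suggestion (table and column names here are made up), using the inheritance-based partitioning available at the time — declarative partitioning only arrived later, in PostgreSQL 10:

```sql
-- Parent table; queries against it also scan the children.
CREATE TABLE logs (
    logged_at  timestamptz NOT NULL,
    message    text
);

-- One child table per day; the CHECK constraint lets the planner
-- skip irrelevant children (constraint exclusion).
CREATE TABLE logs_2009_09_14 (
    CHECK (logged_at >= '2009-09-14' AND logged_at < '2009-09-15')
) INHERITS (logs);

-- Querying the parent transparently covers all children:
SELECT count(*) FROM logs WHERE logged_at >= '2009-09-14';
```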

2 Answers


Yes, PostgreSQL will do this automatically for you when values go above a certain size. The mechanism is called TOAST. Compression is applied to each individual data value, though — not at the full table level. That means a billion very narrow rows won't get compressed, and neither will a table with very many columns that each hold only a small value. Details about this scheme are in the manual.
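To illustrate (the table and column names are made up): you can see whether a stored value was compressed by comparing its on-disk size to its logical size, and you can tune the behaviour per column:

```sql
-- pg_column_size reports the stored (possibly compressed) size;
-- octet_length reports the uncompressed size of the same value.
SELECT pg_column_size(message) AS stored_bytes,
       octet_length(message)  AS raw_bytes
FROM   logs
LIMIT  10;

-- Per-column storage strategy: EXTENDED (the default for text) allows
-- both compression and out-of-line storage; EXTERNAL disables compression.
ALTER TABLE logs ALTER COLUMN message SET STORAGE EXTENDED;

-- PostgreSQL 14 and later: choose the compression algorithm per column.
ALTER TABLE logs ALTER COLUMN message SET COMPRESSION lz4;
```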

If you need it at the full table level, a workaround is to create a TABLESPACE for the tables you want compressed and point it at a compressed filesystem. As long as the filesystem still obeys fsync() and standard POSIX semantics, this should be perfectly safe. Details about this are in the manual.
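A minimal sketch of that approach (the mount point and tablespace name are made up; a compressed filesystem, e.g. ZFS with compression enabled, must already be mounted and writable by the postgres OS user):

```sql
-- The directory must already exist, be empty, and be owned by
-- the postgres OS user.
CREATE TABLESPACE compressed_space LOCATION '/mnt/compressed_fs/pgdata';

-- Place the bulky log tables on the compressed filesystem.
CREATE TABLE logs_archive (
    logged_at  timestamptz NOT NULL,
    message    text
) TABLESPACE compressed_space;
```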

pd40
Magnus Hagander
  • Are there no plans to introduce a page compression option? This would be very helpful for exactly the scenario you are describing. Sure, you can use btrfs or zfs, but these have their own drawbacks for heavy-I/O files, especially if compressed. – jjxtra Aug 25 '22 at 17:19
  • It would be especially useful and simple to not repeatedly store the same prefix value in an index slice: e.g. in an index over (parent_id, child_id), don't repeat parent_id for each child with the same parent, but store it perhaps once per page. I wouldn't even call that compression, but unfortunately that's not how it's done. – John Sep 29 '22 at 17:33
  • Using a compressed file system is a crutch: what Postgres thinks of as a page no longer maps to whole blocks on the FS, leading to another form of fragmentation. – John Sep 29 '22 at 17:36

Probably not what you have in mind, but still useful info: Chapter 53, Database Physical Storage, of the fine manual. The TOAST section in particular warrants further attention.
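For example (the table name is made up), you can check how much of a table's footprint lives out-of-line in its TOAST table by breaking the size down:

```sql
-- pg_table_size includes the TOAST table; pg_relation_size is the
-- main heap only, so the difference is roughly the TOAST overhead.
SELECT pg_size_pretty(pg_table_size('logs'))    AS table_and_toast,
       pg_size_pretty(pg_relation_size('logs')) AS heap_only,
       pg_size_pretty(pg_indexes_size('logs'))  AS indexes;
```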

Milen A. Radev