6

I have PostgreSQL database. It is used for unit testing.

  • I want to speed the tests up so I want to use some kind of in-memory databases (e.g. H2).
  • I want to dump the database (from PostgreSQL) and the import such a dump into in-memory database.

Do you have any suggestion concerning in-memory database choice? I want that database to be compatible with PostgreSQL.

Jonas G. Drange
  • 8,749
  • 2
  • 27
  • 38
Lucas Smith
  • 231
  • 3
  • 9
  • Have you verified (by measuring) that the bottleneck is your database server's disk IO ? Do you have an idea how much you can improve your test throughput by switching to in-memory database ? – user272735 Aug 01 '12 at 19:50
  • Which operating system is used for development and testing? – Tometzky Aug 02 '12 at 09:35

3 Answers3

4

I'd simply create a database directory (called a cluster in PostgreSQL) in tmpfs (essentially a RAM-disk - /dev/shm is configured as such in most Linux distributions) and simply run postgres there on non-standard port, for example like this:

initdb -D /dev/shm/pgtest
postmaster -D /dev/shm/pgtest -p 11111
Tometzky
  • 22,573
  • 5
  • 59
  • 73
  • 1
    this solution is not portable. each member of your team will have to install postgres. CI administrator will have to install postgres. and what if it was a heavyweight database (e.g. oracle)? just no. – piotrek Aug 02 '12 at 06:51
  • Of course it is not portable. I think installing Postgres isn't too much burden, as most Linux distributions have it packaged. And testing against database other than used in production is not very effective. – Tometzky Aug 02 '12 at 09:35
  • Just a nitpick: You're mixing your terminology. Pg calls a set of databases created by initdb and managed by a postmaster a "cluster" (unfortunate choice of term, but oh well). A [tablespace](http://www.postgresql.org/docs/9.1/static/manage-ag-tablespaces.html) is a subdirectory of the datadir that lets you move some tables, indexes, etc to a different disk. – Craig Ringer Aug 02 '12 at 12:01
  • @CraigRinger: **PLEASE DON'T!** Never put a tablespace on a ramdisk on any other unstable storage. After reboot, when this tablespace files will be gone or corrupted, **the whole database would not start**. – Tometzky Aug 05 '12 at 21:45
  • @Tometzky Good point, it really needs that caveat. Removed. – Craig Ringer Aug 05 '12 at 23:28
  • you forgot to mention that `initdb` is not installed by default. `postgres-xc` package is needed on debian. I still can't find it on http://apt.postgresql.org – brauliobo Feb 10 '15 at 17:42
3

I would recommend using HSQL with a spring combination. I was brought on to my current occupation to do this exact thing, and even though it is a headache, it can be done. And based on quick research, it appears hsql is compatible with PostgreSQL. Let me know if you have any other questions.

nook
  • 2,378
  • 5
  • 34
  • 54
  • Could you tell me why you think it is a headache? Is there Spring necessary? What about H2? – Lucas Smith Aug 01 '12 at 18:42
  • 1
    That `set database sql syntax PGS statement` feature is AWESOME. It's clearly still not as good as testing directly against Pg, but it's very impressive. Thanks for the pointer. – Craig Ringer Aug 02 '12 at 12:06
0

Do you need to drop Pg for this?

If you're unit testing against a different database than you run against in production your tests just won't be as good.

Try this: Optimise PostgreSQL for fast testing .

... and see how you go.

update: You might think putting everything on a ramdisk will make your tests massively faster, but you're probably mistaken. It's often more efficient to let data that isn't in use spill to disk so the system can use more RAM for the task at hand. Using unlogged tables and proper async commit or (for testing only, will eat your data) fsync=off should get you very similar speeds.

If you really require a true in-memory tables, create a file system on a ramdisk or tmpfs and initdb a new cluster in there. You might as well use UNLOGGED tables so they don't go through the write-ahead log.

Note that while tmpfs is a ramdisk, it allows less-used pages to go to swapspace. If you absolutely must have all the data pinned in RAM for some reason, you'll need to use the ramdisk driver and create a file system on a ramdisk instead. My advice: don't do it. Just configure Pg properly for fast testing.

Community
  • 1
  • 1
Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
  • I have to make a dump of a PostgreSQL database (schema and some dictionary values). Next, I have to import such dump into in-memory database and then perform all of the unit tests. How can I achieve that? – Lucas Smith Aug 04 '12 at 10:16
  • @LucasSmith I'm asking *why* the in-memory database is a requirement. Is it purely a performance concern, or is there some other reason? If it's purely performance, a properly configured Pg should run very nearly as fast as an in-memory database when it is working on a data set that fits easily in memory. Trying to go all in-memory may be premature or misdirected optimisation. – Craig Ringer Aug 05 '12 at 03:55