51

Is there an embedded PostgreSql so that we could unit test our PostgreSql driven application?

Since PostgreSql has some dialects, it's better to use embedded PostgreSql itself than other embedded databases.

Embedded does not necessarily mean it must be embedded in the JVM process. It also does not necessarily need to use in-memory persistence. It should be loaded automatically by the dependency management (Maven, Gradle), so that Unit tests can run on every machine without having to install and configure a local PostgreSQL server.

Bastian Voigt
  • 5,311
  • 6
  • 47
  • 65
blue123
  • 2,937
  • 7
  • 27
  • 29

6 Answers6

55

There is an "embedded" PostgresSQL server that has been designed for unit testing from Java:

https://github.com/yandex-qatools/postgresql-embedded

Embedded postgresql will provide a platform neutral way for running postgres binary in unit tests. Much of the code has been crafted from Flapdoodle OSS's embed process

As an aside, there also exists similar projects for Mongo, Redis, Memcached and nodejs.

rbento
  • 9,919
  • 3
  • 61
  • 61
btiernay
  • 7,873
  • 5
  • 42
  • 48
  • 1
    why isn't this marked as the correct answer? is there a downside to this lib? does it have all the features of postgre? Can it be embelished with modules etc? – Radu Simionescu Mar 18 '16 at 15:26
  • 1
    I can testify that it works well. Standard Postgres. Example usage: https://github.com/icgc-dcc/dcc-submission/blob/develop/dcc-submission-loader/src/test/java/org/icgc/dcc/submission/loader/util/EmbeddedPostgres.java https://github.com/icgc-dcc/dcc-submission/blob/c5f6e38783fe164b35a63ac15dfb22fe9559d5f5/dcc-submission-loader/src/test/java/org/icgc/dcc/submission/loader/util/AbstractPostgressTest.java – btiernay Mar 18 '16 at 17:42
  • 1
    what 'embedded' means is actually quite unclear, but to me it is running in memory just like H2. and yandex-qatools/postgresql-embedded does not conform to this definition. in fact it downloads the specified postgres version, unpacks it to temporary dir and launches the server, all this leads to disk overhead, which is hardly acceptable in non enterprise environment. i'm judging of this sample project https://github.com/scottmf/postgresql-integration-test , which is still a good example. correct me if i'm wrong – Valya Oct 28 '16 at 04:41
  • 6
    A few members of my team implemented https://github.com/opentable/otj-pg-embedded, and it seems to be serving them well. It has cleaner interface when working with Spring configs. – dseibert Nov 06 '16 at 03:53
  • This solution has issues if you try to use it out of its default configuration (download package from different place etc.) – Jan Zyka Dec 12 '16 at 13:05
  • Note if running on a mac you may need to define the following environment variables, such that they are available both to your build and IDE:export LC_ALL="en_US.UTF-8" export LC_CTYPE="en_US.UTF-8" – murungu Oct 03 '17 at 12:16
  • @RaduSimionescu Because it's not, IMO, embedded PostgreSQL. It's a heavily modified/hacked up server based on PostgreSQL that may or may not work for a given purpose. – Craig Ringer Jan 28 '18 at 13:03
36

No, there is no embedded PostgreSQL, in the sense of an in-process-loadable database-as-a-library. PostgreSQL is process oriented; each backend has one thread, and it spawns multiple processes to do work. It doesn' make sense as a library.

The H2 database supports a limited subset of the PostgreSQL SQL dialect and the use of the PgJDBC driver.

What you can do is initdb a new temporary database, start it with pg_ctl on a randomized port so it doesn't conflict with other instances, run your tests, then use pg_ctl to stop it and finally delete the temporary database.

I strongly recommend that you run the temporary postgres on a non-default port so you don't risk colliding with any locally installed PostgreSQL on the machine running the tests.

(There is "embedded PostgreSQL in the sense of ecpg, essentially a PostgreSQL client embedded in C source code as preprocessor based C language extensions. It still requires a running server and it's a bit nasty to use, not really recommended. It mostly exists to make porting from various other databases easier.)

Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
  • That's what I'd do as well. A batch file/shell script to setup Postgres without any installer is about 5 lines of code. Although doing an initdb on every unit test-run is probably going to be too slow. –  Jan 14 '13 at 07:19
  • @a_horse_with_no_name Yeah, you can always unzip a pre-initdb'd database or copy from a clean template that you initdb if it's missing during the build process. Painful when you upgrade Pg though. Personally I find initdb plenty fast enough, but I don't use spinning disks anymore. As for the batch file - the main challenge is making sure you use a non-conflicting port (especially if running concurrent tests) and making really sure you terminate the server when you're done. Especially on Windows, where you can't delete the datadir until the server is stopped. – Craig Ringer Jan 14 '13 at 07:46
  • 7
    You say "it doesn't make sense for a process oriented piece to be packaged as a library". Why not? You can automate IE nowadays using a Java library. Sure, processes will be spawned behind the scenes, but you can still talk comfortably to a Java library to achieve it. The same should be possible with PostgreSQL. You can get close to that goal by embedding a PostgreSQL installer, but there's a lot of boilerplate just to set it up like that. – Mihai Danila Aug 13 '13 at 21:47
  • @MihaiDanila The key difference is that PostgreSQL is writing to shared, critical data files. Process management must be _exactly_ right. What you describe could be achieved with a tool that automates pg_ctl from Java, and that'd certainly be useful, but there's no need or benefit to bringing the PostgreSQL server its self in-process. – Craig Ringer Aug 13 '13 at 23:46
  • 1
    Can PostgreSQL be started on the fly whether or not it has been previously installed on the machine? Perhaps that's all that the OP needs. I know that's what I would need. I would not want for my unit tests to depend on PostgreSQL being pre-installed, because this would mean depending on the correct functioning of processes in a different team, specifically, the team that maintains the TeamCity agents. Plus, I would expect the fresh process approach to allow full server configurability, something generally difficult in a pre-existing installation (probably as root). – Mihai Danila Aug 14 '13 at 15:21
  • 2
    @MihaiDanila Absolutely, and that's what I would recommend. Just bundle the .zip of the Pg binaries, and launch Pg using `pg_ctl` on demand. You can edit the configs after `initdb` and with `initdb` arguments, control things like ports via environment variables or args to `pg_ctl`, etc. – Craig Ringer Aug 15 '13 at 00:49
  • Cool. And if all of the details of starting `pg_ctl` and `initdb` could be hidden behind a connection string URL, similar to the (now retired) MySQL MXJ connector library, that would be ideal. – Mihai Danila Aug 15 '13 at 16:32
  • This is not true: https://github.com/yandex-qatools/postgresql-embedded – gtrak Jan 12 '18 at 15:45
31

I tried the project suggested by @btiernay (yandex-qatools). I spent a good few days with this and without any offence it's over engineered solution which doesn't work in my case as I wanted to download the binaries from internal repository rather than going to public internet. In theory it supports it but in fact it doesn't.

OpenTable Embedded PostgreSQL Component

I ended up using otj-pg-embedded and it works like a charm. It was mentioned in comments so I thought I'll mention it here as well.

I used it as standalone DB and not via rule for both unit tests and local development.

Dependency:

<dependency>
    <groupId>com.opentable.components</groupId>
    <artifactId>otj-pg-embedded</artifactId>
    <version>0.7.1</version>
</dependency>

Code:

@Bean
public DataSource dataSource(PgBinaryResolver pgBinaryResolver) throws IOException {
    EmbeddedPostgres pg = EmbeddedPostgres.builder()
        .setPgBinaryResolver(pgBinaryResolver)
        .start();


    // It doesn't not matter which databse it will be after all. We just use the default.
    return pg.getPostgresDatabase();
}

@Bean
public PgBinaryResolver nexusPgBinaryResolver() {
    return (system, machineHardware) -> {
        String url = getArtifactUrl(postgrePackage, system + SEPARATOR + machineHardware);
        log.info("Will download embedded Postgre package from: {}", url);

        return new URL(url).openConnection().getInputStream();
    };
}

private static String getArtifactUrl(PostgrePackage postgrePackage, String classifier) {
    // Your internal repo URL logic
}
Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
Jan Zyka
  • 17,460
  • 16
  • 70
  • 118
  • 1
    Is it possible to create schema from `.sql` file in `otj-pg-embedded`. I am not using spring. I did not find any example without `Rule`. – tuk Jan 09 '18 at 09:59
  • 2
    See dzone.com article: [*Using Embedded PostgreSQL Databases For Testing*](https://dzone.com/articles/using-embedded-postgresql-databases-for-unit-testi) – Basil Bourque Jun 25 '18 at 04:31
  • 1
    Been using otj-pg-embedded with great success for years. Works like a charm – Bastian Voigt Feb 28 '19 at 07:01
4

You can use a container instance of PostgreSQL.

Since spinning a container is a matter of seconds, this should be good enough for unittests. Moreover, in case you need to persist the data, e.g. for investigation, you don't need to save the entire container, only the data files, which can be mapped outside of the container.

One of example of how to do this can be found here.

Arik
  • 1,257
  • 9
  • 13
4

I am using the container instance of PostgreSQL in the tests. https://www.testcontainers.org/#about https://www.testcontainers.org/modules/databases/jdbc/

dependencies:

        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-api</artifactId>
            <version>5.7.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-params</artifactId>
            <version>5.7.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-engine</artifactId>
            <version>5.7.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.testcontainers</groupId>
            <artifactId>testcontainers</artifactId>
            <version>1.15.3</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.testcontainers</groupId>
            <artifactId>junit-jupiter</artifactId>
            <version>1.15.3</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.testcontainers</groupId>
            <artifactId>postgresql</artifactId>
            <version>1.15.3</version>
            <scope>test</scope>
        </dependency>

And do the tests:

@SpringBootTest
@ActiveProfiles({"test"})
@Testcontainers
class ApplicationTest {
    @Container
    static PostgreSQLContainer<?> postgreSQL = new PostgreSQLContainer<>("postgres:12.7");

    @DynamicPropertySource
    static void postgreSQLProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.username", postgreSQL::getUsername);
        registry.add("spring.datasource.password", postgreSQL::getPassword);
    }


    @Test
    void someTests() {

    }

in application-test.yml:

source:
  datasource:
    url: jdbc:tc:postgresql:12.7:///databasename
firstpostcommenter
  • 2,328
  • 4
  • 30
  • 59
3

If you are looking to run an in-process version of postgres from an Integration (or similar) test suite, the postgresql-embedded worked fine for me.

I wrote a small maven plugin that can be used as a maven wrapper around a forked version of postgresql-embedded.

aramcodez
  • 102
  • 1
  • 8
  • Again, be warned this solution has issues if you step outside of the default configuration – Jan Zyka Dec 13 '16 at 10:53
  • Fair enough. But, in the maven plugin, I specifically added a configuration option for setting the download URL for binaries to a location other than the default. – aramcodez Dec 13 '16 at 20:34
  • Please checkout https://github.com/aramcodz/embedded-postgres-maven-plugin and look for the "downloadUrl" plugin configuration parameter. – aramcodez Dec 13 '16 at 20:35
  • Did you try that? Because the underlying library has bugs. No plugin can fix this - the library itself has the bug. Try to clean local cache of postgre binaries, downloadnfrom local url (file:///...) while being disconnected from internet. I bet it won't start up (at least it was failing for me and I debugged it to a point where I clearly saw a bug which ignores any settings passed to initDb command) – Jan Zyka Dec 14 '16 at 08:04
  • @JanZyka, Yes, I just tried it and found that the _download from local_ succeeded. I cleaned the local cache of binaries (moved to another local folder (away from ~/.embedpostgresql), and used the downloadUrl of file:////my_local_folder/dev/tmp/ in the maven config. Here is a snippet of the output: 'Download Version{9.2.4-1}:OS_X:B64 START Download Version{9.2.4-1}:OS_X:B64 DownloadSize: 64139929 Download Version{9.2.4-1}:OS_X:B64 0% 1%...94% 95% 96% 97% 98% 99% 100% Download Version{9.2.4-1}:OS_X:B64 downloaded with 62636kb/s Download Version{9.2.4-1}:OS_X:B64 DONE' – aramcodez Dec 14 '16 at 14:25
  • That's interesting (really). I should have said my problem was that the process in fact run 3 processes: `Postgres` and this one runs 2 subprocesses as part of the `beforeExecute()` and `afterExecute()`: `initDb` and `createDb`. In my case the first download succeeds but the other two don't inherit the download config. Key to reproduce the bug is also changing the temp binary folder to other place. But well ... if it works for you I might have done some mistake. Not convinced yet though :) – Jan Zyka Dec 14 '16 at 14:55