Postgres and multiple locations of data storage

Question

My Postgres server keeps its data in the default location on my C: drive. I would like to restore a backup into a second database and access it through the same Postgres server instance. The problem is that the database is too big to be restored onto the same C: drive. Is it possible to tell Postgres to restore the second database to another location/drive, while the first one stays where it is? For example, database1 on my C: drive and database2 on my D: drive?

Otherwise the second-best solution would be to install 2 separate Postgres instances, but that seems like a bit of overkill?

score 4 · Answer 1 · answered Nov 11 '19 at 05:04

That should be entirely achievable if your backup was made with the Postgres pg_dump command.

Unless the dump was made with the -C (--create) option, the pg_dump script does not create the database, so you create it yourself first. Use CREATE TABLESPACE to specify the location.

CREATE TABLESPACE secondspace LOCATION 'D:\postgresdata';

CREATE DATABASE seconddb TABLESPACE secondspace;

This creates an empty database on the D: drive.
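
One prerequisite worth checking for the CREATE TABLESPACE step: the target directory has to exist already, be empty, and be owned by the operating-system account that runs the Postgres server. Making it world-writable is not enough, because the server tries to change the directory's permissions itself, and that only works for the owner. Below is a small shell sketch for Linux that checks those three conditions. The function name check_tablespace_dir and the default owner "postgres" are my assumptions, so adjust them for your install.

```shell
#!/bin/sh
# Check that a directory looks usable as a tablespace location.
# Usage: check_tablespace_dir DIR [OWNER]   (OWNER defaults to "postgres",
# which is an assumption about how the server was installed)
check_tablespace_dir() {
    dir=$1
    want_owner=${2:-postgres}

    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi

    # GNU stat syntax (Linux); BSD and macOS stat use different flags.
    owner=$(stat -c '%U' "$dir") || return 1
    if [ "$owner" != "$want_owner" ]; then
        echo "owner is $owner, expected $want_owner"
        return 1
    fi

    # ls -A lists everything except . and .., so any output means "not empty".
    if [ -n "$(ls -A "$dir")" ]; then
        echo "not empty: $dir"
        return 1
    fi

    echo "ok: $dir"
}
```

If the owner turns out to be wrong, running something like `sudo chown postgres:postgres /path/to/dir` before CREATE TABLESPACE is the usual fix (the path is a placeholder).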

Then the standard restore from a pg_dump should work:

psql seconddb < dumpfile
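
If the backup turns out to be a custom-format archive (made with pg_dump -Fc) rather than a plain SQL script, psql cannot read it and pg_restore is needed instead. Here is a rough sketch that guesses the format from the first bytes of the file and prints a suggested command without running anything. suggest_restore is a hypothetical helper name, and it only tells custom-format archives apart from everything else (tar and directory dumps are not handled). The --no-tablespaces option asks pg_restore to skip tablespace selection, so objects land in the target database's default tablespace.

```shell
#!/bin/sh
# Print a suggested restore command for a dump file (nothing is executed).
# Usage: suggest_restore DUMPFILE DBNAME
suggest_restore() {
    dump=$1
    db=$2
    # Custom-format archives written by pg_dump -Fc begin with the bytes "PGDMP".
    magic=$(dd if="$dump" bs=5 count=1 2>/dev/null)
    if [ "$magic" = "PGDMP" ]; then
        echo "pg_restore --no-tablespaces -d $db $dump"
    else
        # Anything else is assumed to be a plain SQL script.
        echo "psql -d $db -f $dump"
    fi
}
```

For example, `suggest_restore dumpfile seconddb` prints the psql form for a plain SQL dump; `psql -d seconddb -f dumpfile` is equivalent to the redirect shown above.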

answered Nov 11 '19 at 05:04

GregHNZ


I get the following error when I run your first command: SQL Error [42501]: ERROR: could not set permissions on directory "/media/tb/UBUNTU_2/Postgres": Operation not permitted ... I googled it and also set the permissions of the folder via "sudo chmod -R 777 /media/tb/UBUNTU_2/Postgres" but it still complains (I'm not running SELinux, just regular Ubuntu 18.04) – PabloDK Nov 12 '19 at 12:25
Does this help? https://stackoverflow.com/questions/5208094/creating-a-tablespace-in-postgresql - if not, what filesystem type is on that drive? – GregHNZ Nov 13 '19 at 04:52
"sudo chmod -R 777 /media/tb/UBUNTU_2/Postgres" should already give all users permission to everything, right? And it doesn't work, so I guess it is due to something else? The file system is Linux ext4 – PabloDK Nov 13 '19 at 08:24

Basil Bourque · Answer 2 · 2019-11-11T04:54:46.173

Replication

Sounds like you need database replication.

There are several ways to do this with Postgres, one built-in, and other approaches using add-on libraries.

Built-in replication feature

The built-in replication feature is likely to suit your needs. See the manual. In this approach, you have an instance of Postgres running on your primary server, handling reads and writes of your data. On a second server, an entirely separate computer, you run another instance of Postgres known as the replica. You first set up the replica by taking a full base backup of the primary and restoring it on the second server.

Next you configure the replication feature. The replica needs to know it is playing the role of a replica rather than a regular database server. And the primary server needs to know the replica exists, so that every database change, every insert, modification, and deletion, can be communicated.
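
To make that configuration step a little more concrete, here is a hedged sketch for streaming replication. It only writes sample fragments into the current directory for review and does not touch a live server. The host name, user name, IP address, data directory, and numeric values are all hypothetical, and the exact settings you need depend on your Postgres version.

```shell
#!/bin/sh
# Write sample fragments into the current directory; nothing here changes a server.

# Settings for the primary's postgresql.conf (values are illustrative only).
cat > primary.conf.sample <<'EOF'
wal_level = replica        # write enough WAL for a standby to replay
max_wal_senders = 5        # connections available for streaming WAL
EOF

# Entry for the primary's pg_hba.conf so the replica may connect for replication.
# "replicator" and 192.0.2.10 are hypothetical; pick an auth method your setup supports.
cat > pg_hba.sample <<'EOF'
host  replication  replicator  192.0.2.10/32  scram-sha-256
EOF

# Command to seed the replica, to be run on the replica host.
# -R writes the standby settings, -X stream includes the WAL needed for a
# consistent copy, -P shows progress.
cat > seed_replica.sample <<'EOF'
pg_basebackup -h primary.example.com -U replicator -D /var/lib/postgresql/replica -R -X stream -P
EOF
```

The -R option is what tells the new instance that it is playing the role of a replica, so there is no need to write that part by hand.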

WAL

This communication happens via WAL files.

With Write-Ahead Logging (WAL), Postgres records every change in the WAL first, and only after that record is safely on disk does it write the change to the actual data files. After a crash, power outage, or other failure, the server on restart replays the "To-Do" list of work recorded in the WAL since the last checkpoint. Changes from committed transactions are reapplied, and a transaction left incomplete never takes effect.

Every so often the current WAL file is closed, and a new WAL file is created to take over the work. With file-based replication (log shipping) enabled, each closed WAL file is copied to the replica. The replica then replays that file, following the same "To-Do" list of changes. So all changes are made to the replica exactly as they were made to the primary. Your replica is an exact match of the primary, except for a slight lag in time: with log shipping it is roughly one WAL file behind, while streaming replication sends WAL records as they are generated and keeps the lag smaller.
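
The lag can be put into numbers by comparing WAL positions (LSNs). On Postgres 10 or later, `SELECT pg_current_wal_lsn();` on the primary and `SELECT pg_last_wal_replay_lsn();` on the replica each return a value such as 0/3000060. The sketch below turns two such values into a byte difference in plain shell. The helper names lsn_to_bytes and lsn_lag are mine, not part of Postgres, which offers pg_wal_lsn_diff() for the same job inside SQL.

```shell
#!/bin/sh
# Convert an LSN such as "16/B374D848" to a byte position.
# An LSN is two hexadecimal numbers: the high and low 32 bits of a 64-bit offset.
lsn_to_bytes() {
    hi=${1%%/*}
    lo=${1#*/}
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# Bytes of WAL the replica still has to replay.
# Usage: lsn_lag PRIMARY_LSN REPLICA_LSN
lsn_lag() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}
```

For scale, a WAL file is 16 MB by default, so "one WAL file behind" corresponds to a lag of up to 16777216 bytes.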

In times of trouble, the replica serves as a stand-by. You can shut down the primary, then promote the replica so that it becomes the new primary. In Postgres terminology, a warm stand-by only replays WAL until it is promoted, while a hot stand-by also accepts read-only queries in the meantime. Postgres does not fail over on its own; automatic take-over when the primary seems to have failed requires external tooling, and it has pros and cons.

Offload read-only queries

As a bonus feature, the replica can be used for read-only queries. If your database is heavily used, you can offload some of the work burden from your primary to the replica. Any queries that do not require the absolute latest information can be shifted by connecting to the replica rather than the original. For example, a quarterly sales report likely does not need the latest data stored in the active WAL file that has not yet arrived on the replica.

Physical replication means all databases are copied

Caveat: This built-in replication feature is physical replication. This means all the changes to the entire Postgres installation (formally known as a cluster, not to be confused with a hardware cluster) are copied to the replica. If you use one Postgres server to serve multiple databases, all those databases must be replicated – you cannot pick and choose which get copied over. If you need to be selective, logical replication, built in since Postgres 10, can replicate chosen tables instead.
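
For completeness, here is a minimal sketch of the selective alternative, assuming Postgres 10 or later with wal_level = logical on the publishing server. It writes two SQL files for review rather than running them. The publication, subscription, table, database, host, and user names are all made up for illustration.

```shell
#!/bin/sh
# Write the two SQL scripts into the current directory; nothing is executed.

cat > publisher.sql <<'EOF'
-- Run in the source database on the publishing server (needs wal_level = logical).
CREATE PUBLICATION sales_pub FOR TABLE orders, customers;
EOF

cat > subscriber.sql <<'EOF'
-- Run in the target database; the tables must already exist there
-- with matching definitions.
CREATE SUBSCRIPTION sales_sub
    CONNECTION 'host=primary.example.com dbname=sales user=replicator'
    PUBLICATION sales_pub;
EOF

# Apply each file with psql against the appropriate server, for example:
#   psql -h primary.example.com -d sales -f publisher.sql
#   psql -h replica.example.com -d sales -f subscriber.sql
```

Only the listed tables are copied and kept in sync, which is the pick-and-choose behavior that physical replication cannot offer.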

More to learn

I am being brief here. The topics of replication, high-availability, and disaster-recovery are broad and complex, too much for an Answer on Stack Overflow.

Tip: This kind of Question might have been better asked on the sister site, DBA.StackExchange.com.

I think replication is way too much here. A tablespace on the second drive, with the target database using that tablespace is a lot easier to achieve as PabloDK already has a dump to be imported — , Nov 11 '19 at 06:56