Difference between Stream Replication and logical replication

Question

Could anybody tell me more about difference between physical replication and logical replication in PostgreSQL?

Craig Ringer · Accepted Answer · 2022-11-29T06:03:43.273

TL;DR: Logical replication sends row-by-row changes, physical replication sends disk block changes. Logical replication is better for some tasks, physical replication for others.

Note that in PostgreSQL 12 (current at time of update) logical replication is stable and reliable, but quite limited. Use physical replication if you are asking this question.

Streaming replication can be logical replication. It's all a bit complicated.

WAL-shipping vs streaming

There are two main ways to send data from master to replica in PostgreSQL:

WAL-shipping or continuous archiving, where individual write-ahead-log files are copied from pg_xlog by the archive_command running on the master to some other location. A restore_command configured in the replica's recovery.conf runs on the replica to fetch the archives so the replica can replay the WAL.

This is what's used for point-in-time replication (PITR), which is used as a method of continuous backup.

No direct network connection is required to the master server. Replication can have long delays, especially without an archive_timeout set. WAL shipping cannot be used for synchronous replication.
streaming replication, where each change is sent to one or more replica servers directly over a TCP/IP connection as it happens. The replicas must have a direct network connection the master configured in their recovery.conf's primary_conninfo option.

Streaming replication has little or no delay so long as the replica is fast enough to keep up. It can be used for synchronous replication. You cannot use streaming replication for PITR¹ so it's not much use for continuous backup. If you drop a table on the master, oops, it's dropped on the replicas too.

Thus, the two methods have different purposes. However, both of them transport physical WAL archives from primary to replica; they differ only in the timing, and whether the WAL segments get archived somewhere else along the way.

You can and usually should combine the two methods, using streaming replication usually, but with archive_command enabled. Then on the replica, set a restore_command to allow the replica to fall back to restore from WAL archives if there are direct connectivity issues between primary and replica.

Asynchronous vs synchronous streaming

On top of that, there's synchronous and asynchronous streaming replication:

In asynchronous streaming replication the replica(s) are allowed to fall behind the master in time when the master is faster/busier. If the master crashes you might lose data that wasn't replicated yet.

If the asynchronous replica falls too far behind the master, the master might throw away information the replica needs if max_wal_size (was previously called wal_keep_segments) is too low and no slot is used, meaning you have to re-create the replica from scratch. Or the master's pg_wal(waspg_xlog) might fill up and stop the master from working until disk space is freed if max_wal_size is too high or a slot is used.
In synchronous replication the master doesn't finish committing until a replica has confirmed it received the transaction². You never lose data if the master crashes and you have to fail over to a replica. The master will never throw away data the replica needs or fill up its xlog and run out of disk space because of replica delays. In exchange it can cause the master to slow down or even stop working if replicas have problems, and it always has some performance impact on the master due to network latency.

When there are multiple replicas, only one is synchronous at a time. See synchronous_standby_names.

You can't have synchronous log shipping.

You can actually combine log shipping and asynchronous replication to protect against having to recreate a replica if it falls too far behind, without risking affecting the master. This is an ideal configuration for many deployments, combined with monitoring how far the replica is behind the master to ensure it's within acceptable disaster recovery limits.

Logical vs physical

On top of that we have logical vs physical streaming replication, as introduced in PostgreSQL 9.4:

In physical streaming replication changes are sent at nearly disk block level, like "at offset 14 of disk page 18 of relation 12311, wrote tuple with hex value 0x2342beef1222....".

Physical replication sends everything: the contents of every database in the PostgreSQL install, all tables in every database. It sends index entries, it sends the whole new table data when you VACUUM FULL, it sends data for transactions that rolled back, etc. So it generates a lot of "noise" and sends a lot of excess data. It also requires the replica to be completely identical, so you cannot do anything that'd require a transaction, like creating temp or unlogged tables. Querying the replica delays replication, so long queries on the replica need to be cancelled.

In exchange, it's simple and efficient to apply the changes on the replica, and the replica is reliably exactly the same as the master. DDL is replicated transparently, just like everything else, so it requires no special handling. It can also stream big transactions as they happen, so there is little delay between commit on the master and commit on the replica even for big changes.

Physical replication is mature, well tested, and widely adopted.

logical streaming replication, new in 9.4, sends changes at a higher level, and much more selectively.

It replicates only one database at a time. It sends only row changes and only for committed transactions, and it doesn't have to send vacuum data, index changes, etc. It can selectively send data only for some tables within a database. This makes logical replication much more bandwidth-efficient.

Operating at a higher level also means that you can do transactions on the replica databases. You can create temporary and unlogged tables. Even normal tables, if you want. You can use foreign data wrappers, views, create functions, whatever you like. There's no need to cancel queries if they run too long either.

Logical replication can also be used to build multi-master replication in PostgreSQL, which is not possible using physical replication.

In exchange, though, it can't (currently) stream big transactions as they happen. It has to wait until they commit. So there can be a long delay between a big transaction committing on the master and being applied to the replica.

It replays transactions strictly in commit order, so small fast transactions can get stuck behind a big transaction and be delayed quite a while.

DDL isn't handled automatically. You have to keep the table definitions in sync between master and replica yourself, or the application using logical replication has to have its own facilities to do this. It can be complicated to get this right.

The apply process its self is more complicated than "write some bytes where I'm told to" as well. It also takes more resources on the replica than physical replication does.

Current logical replication implementations are not mature or widely adopted, or particularly easy to use.

Too many options, tell me what to do

Phew. Complicated, huh? And I haven't even got into the details of delayed replication, slots, max_wal_size, timelines, how promotion works, Postgres-XL, BDR and multimaster, etc.

So what should you do?

There's no single right answer. Otherwise PostgreSQL would only support that one way. But there are a few common use cases:

For backup and disaster recovery use pgbarman to make base backups and retain WAL for you, providing easy to manage continuous backup. You should still take periodic pg_dump backups as extra insurance.

For high availability with zero data loss risk use streaming synchronous replication.

For high availability with low data loss risk and better performance you should use asynchronous streaming replication. Either have WAL archiving enabled for fallback or use a replication slot. Monitor how far the replica is behind the master using external tools like Icinga.

References

Does this mean we can just hire you to perform the initial PostgreSQL replication setup? ;) — Joshua Burns, Aug 07 '17 at 03:09
@JoshuaBurns I'm busy with development, but others at https://2ndquadrant.com do indeed assist with such setups. More generally there's https://www.postgresql.org/support/professional_support/. — Craig Ringer, Aug 07 '17 at 03:14
Understood, seems we all tend to be a little too busy :) I took a peek at 2ndquadrant, I'll reach out to them tomorrow. Thanks for the ridiculously quick reply. — Joshua Burns, Aug 07 '17 at 03:17
It's been a few years since this answer and logical replication has come a long way. Any changes in your recommendations? — mlissner, Nov 09 '18 at 07:19
@mlissner You should still default to using physical replication. It's simpler and easier. Use logical replication if you need the added power. — Craig Ringer, Nov 09 '18 at 15:33
Is it implied that both logical and physical replication are referring to streaming replication? So is the WAL shipping physical or logical? — Franklin Yu, Dec 04 '18 at 04:00
@FranklinYu WAL shipping is physical i.e. concerned with raw WAL transport. — Craig Ringer, Dec 04 '18 at 11:57
From these descriptions it seems AWS uses _asynchronous physical streaming replication_ for it's read replicas according to this article by citusdata they use Replicated block device (like san replication) which I guess must be for their multi-AZ solution? https://www.citusdata.com/blog/2018/02/21/three-approaches-to-postgresql-replication/ Another system I've seen that uses logical replication with the WAL2JSON plugin is Debezium, which replicates to Kafka. Logical replication is therefore a good option when your target is not Postgres. — Davos, Feb 19 '20 at 14:33
>If the asynchronous replica falls too far behind the master, the master might throw away information the replica needs if wal_keep_segments is too low and no slot is used. -- What if wal_keep_segments is 0 (default) and a replication slot is in use? — Awesome-o, Mar 27 '21 at 00:33
@Awesome-o Then a replication slot is in use, so WAL is retained indefinitely. However, in PostgreSQL 13 there is now a `max_slot_wal_keep_size` option that sets a limit for slot WAL retention. Consider it an advanced feature, you probably don't need it. Also note that `wal_keep_segments` was replaced with the `max_wal_size` setting. — Craig Ringer, Mar 27 '21 at 12:44
In the [official documentation](https://www.postgresql.org/docs/15/different-replication-solutions.html#HIGH-AVAILABILITY-MATRIX) the built-in streaming replication is listed as an *example* of Write-Ahead Log Shipping, which seems to directly contradict the “vs”-comparison in this answer, which confuses me even more, because I don't understand which aspects can be combined and which cannot in general. — Abrißbirne 66, Nov 24 '22 at 20:28
@Abrißbirne66 "streaming replication" and "replicating using archive_command and restore_command (log shipping)" are two ways to do WAL-archive based replication. Both transport the WAL archives, but they do so different ways. They can be combined, and postgres will switch from one to the other in case of connectivity issues. — Craig Ringer, Nov 29 '22 at 06:01
@CraigRinger is it possible to use log shipping in logical replication ? streaming method is possible in both physical and logical, i understood that but can is wal logs for logical can be sent to some shared location & used ? — Yusuf, Sep 02 '23 at 17:07
@CraigRinger Also i have to appreciate this answer, going over many places for many days but always confused, finally understood the types & methods of replication. — Yusuf, Sep 02 '23 at 17:08