0

I promoted a Postgres 11.16 RDS cluster secondary to function as a stand-alone database. This caused a significant decrease in the RR’s performance. Digging deeper, when comparing the results of running Postgres EXPLAIN on a single production read replica (PROD), and the promoted read replica (RR), there is a 40x decline in performance on the RR. According to AWS there is no change in the environment so I started looking into what Postres is doing.

Here is a summary of an EXPLAIN query; notice the 40x jump in the Actual Total Time After Startup:

Database Estimated Startup Cost (units) Estimated Total Cost (units) Actual Startup Time (ms) Actual Total Time After Startup (ms)
PROD 329.96 29,132.77 18.624 23.968
RR 296.49 26,883.29 28.733 974.719

Part of the performance loss, about half, goes away after 24 hours.

Are there side effects to promoting a RR to a standalone DB? I.e., does Postgress need to do additional organization due to the change?

Toaster
  • 1,911
  • 2
  • 23
  • 43
  • 1
    could you please share the results from explain(analyze, verbose, buffers) for your statement(s) on both servers? (in plain text) And did you run analyze? – Frank Heikens Jun 01 '23 at 08:17
  • @FrankHeikens Thank you for your comments. I wasn’t aware of the buffers option for the explain command. This led me to do a better analysis pointing to a prewarming problem. It occurred to me that for our use case, we just want access to a completely stable read-only instance for an hour or two, so it might not be necessary to promote the read replica, which forces a restart. Do you know if pg_wal_replay_pause() or a similar approach is supported by AWS? – Toaster Jun 02 '23 at 06:58
  • 1
    I'm not familiar wilt pg_wal_replay_pause, but I do know that the extension pg_prewarm is supported by AWS. This could solve your issue – Frank Heikens Jun 02 '23 at 07:09

0 Answers0