Deploying a high availability PostgreSQL 9.0 on Amazon EC2 with PGPool-ii

Question

We have an existing web application that uses PostgreSQL 9.0 and PGPool-ii. I am thinking of migrating our infrastructure to Amazon EC2 and was inspired by the following link, which describes a similar architecture: http://aws.typepad.com/aws/2008/12/running-everything-on-aws-soocialcom.html

Since Amazon RDS doesn't support PGSQL, we are going to stick with PGPool-ii to load-balance the queries across the different DB servers and keep them synchronized with each other.

So we plan to deploy 3 frontend web servers, each of which will contain the following:
- Web Server + PHP code
- PGPool-ii

Then, we would have 2 database servers on separate Amazon instances with only PGSQL. These 2 PG servers would be used by the PGPools located on the 3 frontend servers.

My question is whether this solution is reliable enough, since multiple PGPool instances will access multiple PGSQL servers. Most examples of PGPool demonstrate a single PGPool that uses N underlying PGSQL servers. Is it good practice to deploy a PGPool instance on each web server?

If not, is there any other/better architecture to avoid having a SPOF using Amazon?

Thank you very much for your replies.

organicveggie · Answer 1 · 2011-09-14T02:58:40.120

A couple of thoughts. First, we avoid a SPOF for things like PGPool through the use of Heartbeat, Pacemaker and an ElasticIP. Run two (or more) instances dedicated to PGPool. Assign an ElasticIP to one of them. Set up Heartbeat and Pacemaker to monitor PGPool. On failover, have Pacemaker run a script that assigns the ElasticIP to the new master (DC in Pacemaker terms). If you're only running two nodes, make sure that you disable quorum functionality in Pacemaker, because you can't have a quorum if one node goes down out of a total of two nodes.
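As a rough illustration of the script Pacemaker would run on failover, here is a minimal Python sketch. It is not the script we actually use. It assumes a boto3-style EC2 client is passed in, that the ElasticIP is a VPC address identified by an allocation ID, and the function name `reassign_elastic_ip` is made up for this example.

```python
def reassign_elastic_ip(ec2, allocation_id, instance_id):
    """Point the ElasticIP at the new PGPool master and return the association ID.

    `ec2` is assumed to be a boto3 EC2 client (boto3.client("ec2")).
    AllowReassociation=True lets the call take the address away from
    the failed node instead of raising an error.
    """
    response = ec2.associate_address(
        AllocationId=allocation_id,
        InstanceId=instance_id,
        AllowReassociation=True,
    )
    return response.get("AssociationId")


# Example wiring (needs boto3 and AWS credentials, so it is left commented out;
# both IDs are placeholders):
# import boto3
# reassign_elastic_ip(boto3.client("ec2"), "eipalloc-EXAMPLE", "i-EXAMPLE")
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before trusting it in a real failover.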

To take advantage of the ElasticIP, do a reverse DNS lookup on your ElasticIP from outside of Amazon. This will give you a DNS name that maps to the ElasticIP which should end in amazonaws.com. DNS lookups from an EC2 instance for a domain name ending in amazonaws.com will actually resolve to the internal IP address for the instance that has been assigned the ElasticIP. You can either point your application servers directly at the DNS for the ElasticIP or, assuming you're running your own DNS, you can create a CNAME that refers to the ElasticIP DNS.
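The reverse lookup can be sketched in Python with the standard library. This is only an illustration: the helper name `elastic_ip_dns_name` is hypothetical, and the lookup has to be run from a machine outside of Amazon to see the public name.

```python
import socket


def elastic_ip_dns_name(elastic_ip, reverse_lookup=socket.gethostbyaddr):
    """Return the amazonaws.com DNS name that maps to the ElasticIP.

    `reverse_lookup` defaults to the standard reverse DNS call and is only
    a parameter so the function can be tried without network access.
    """
    hostname, _aliases, _addresses = reverse_lookup(elastic_ip)
    if not hostname.endswith("amazonaws.com"):
        raise ValueError(f"{elastic_ip} does not reverse-resolve to an EC2 name: {hostname}")
    return hostname
```

The returned name is what you would put in the application servers' configuration or use as the CNAME target.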

That said, there's one big catch to using ElasticIPs for failover. Re-assigning the ElasticIP takes up to 120 seconds to take effect. Most of the time is spent waiting for the change to propagate through Amazon's DNS servers.
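One way to cope with that delay is to poll from an application server until the DNS name resolves to the new node. The following is a minimal sketch under my own assumptions: the function name, the 5 second polling interval and the injectable `resolve`, `sleep` and `clock` parameters are illustrative choices, and the 120 second default simply mirrors the figure above.

```python
import socket
import time


def wait_for_failover(dns_name, expected_private_ip, timeout=120.0, interval=5.0,
                      resolve=socket.gethostbyname, sleep=time.sleep,
                      clock=time.monotonic):
    """Poll DNS until `dns_name` resolves to `expected_private_ip`.

    Returns True once the name points at the new node, or False if the
    timeout elapses first.
    """
    deadline = clock() + timeout
    while True:
        try:
            if resolve(dns_name) == expected_private_ip:
                return True
        except OSError:
            # socket.gaierror is an OSError; treat lookup failures as transient.
            pass
        if clock() >= deadline:
            return False
        sleep(interval)
```

The application can hold off reconnecting, or serve a maintenance page, until this returns.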

Also, while I have not tried running PGPool-ii on each Application Server, I'm not sure this would work. If the master database fails, I think each of the PGPool instances would be competing to handle the failover. Maybe I'm just not familiar enough with PGPool-ii to understand the best way to handle that.

As for the person who mentioned plproxy, I think they have it confused with PGBouncer, which is recommended for use with plproxy. plproxy is a partitioning system, not a load balancer. That said, PGBouncer is not a load balancer either: it's a connection pooling system and does not provide load balancing functionality. In fact, the FAQ for PGBouncer explicitly recommends using a TCP load balancer like HAProxy.

In addition, the statements about Amazon having vertical scalability problems that Rackspace solves are incorrect. With Amazon EC2 instances you can always stop an instance and upgrade it to a larger instance type. Neither Amazon nor Rackspace supports changing instance types on the fly.
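For what it's worth, the stop-and-upgrade sequence can also be scripted. Here is a minimal sketch, assuming a boto3-style EC2 client and an EBS-backed instance (instance-store instances cannot be stopped); the function name `resize_instance` is made up for this example.

```python
def resize_instance(ec2, instance_id, new_type):
    """Stop an instance, change its instance type, and start it again.

    `ec2` is assumed to be a boto3 EC2 client (boto3.client("ec2")).
    """
    ec2.stop_instances(InstanceIds=[instance_id])
    # The type can only be changed once the instance is fully stopped.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )
    ec2.start_instances(InstanceIds=[instance_id])
```

The instance is unavailable between the stop and the start, which is the "not on the fly" part.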

score 1 · Answer 2 · answered Aug 04 '11 at 15:54

I do not have a clear picture of pgPool. I have been doing a lot of research on scalability and ruled out pgPool for some reason that I don't remember now.

I would suggest taking a look at plproxy. This offers a load balanced approach.

Also, I wouldn't invest heavily in Amazon because of its vertical scalability problems. You do not get an out-of-the-box upgrade when you want to increase a server's configuration, so you will end up redoing all of your server setup if you upgrade to a larger instance.

Rackspace was more convincing in that respect: you can just ask them to upgrade from 1 GB of RAM to 2 GB or more, and it will be done with just a restart of your instance.

Both Amazon and Rackspace offer a (99%) reliable hosting solution; the remaining 1% we have to cover ourselves with proper backups, distribution across different regions, etc.

Muthu

As I mentioned in my answer, you make several incorrect statements. First, plproxy is not a load balancer for Postgres - it's a partitioning system. Second, Amazon allows you to stop an instance and upgrade it to a larger size. Rackspace is not unique in this regard. – organicveggie Sep 14 '11 at 02:53
Please read my answer again. I wrote "plproxy offers a load balanced approach"; I did not say that it is a load balancer. Skype has used this approach, though I can't find the link right away. Also, the problem I mentioned about Amazon is that you'll have to redo the entire setup if you want to increase your capacity to a larger instance (for example, from a small instance to a large instance). If you know a better way where the vertical upgrade is seamless without copying or a snapshot restore, please do let me know. – Muthu Sep 26 '11 at 02:35
Sorry, I did indeed misread your statement. Although plproxy wouldn't be a good fit for my organization as an alternative to a more traditional load balanced approach, I can see that it's an option for some people. As far as vertically scaling an Amazon EC2 instance goes, you can do this trivially without copying or using snapshot. In fact, it's built right into the Management Console. If you stop an EC2 instance, you can change the instance type. The only caveat, which applies to Rackspace as well, is that you cannot go from a 32-bit instance to a 64-bit instance or the reverse. – organicveggie Oct 31 '11 at 23:50
Aah, how careless I was. I have been using Amazon for months and hadn't noticed the option just because it was buried in the context menu! Thanks for your reply! Some learning today :) – Muthu Nov 01 '11 at 00:51
