I have created a cluster consisting of three RabbitMQ nodes using the join_cluster command.

i.e.

rabbitmqctl -n rabbit2@MYPC1 join_cluster rabbit2@MYPC1

(currently the cluster runs on a single computer)

Questions:
The documentation says there is one implementation for active/passive and one for active/active.

  1. What did I configure?
  2. How do I know?
  3. How can it be changed?
  4. Is there a big performance trade-off between active/active and active/passive?
  5. What is the best practice to interact with active/active?
    e.g. install a load balancer, such as Apache doing round robin?
  6. What is the best practice to interact with active/passive?
    If I interact with only the active node, it is a single point of failure.

Thanks.

Bick

1 Answer

I have been doing some research into availability options with RabbitMQ and while I am still fairly new, I'll attempt to answer your questions with the knowledge I do have. Please understand that these answers are not intended to be comprehensive.

Before getting to the questions, it's worth pointing out that the terms Active/Active and Active/Passive don't really apply to a cluster running on a single computer. They are typically used to describe highly available setups built from more than one logical server (in your case, that would mean multiple RabbitMQ clusters), along with shared or redundant storage, networking, power, and so on.

  1. What did I configure?
    Without load balancing across the nodes in your cluster or queue mirroring, you have neither, meaning you do not have a highly available cluster.
  2. How do I know?
    RabbitMQ does not provide any connection management, so traffic for a failed node will not automatically be passed on to a different node, which is required for an active/active cluster. Without queue mirroring you do not have fully redundant nodes in your cluster, which is required for active/passive.
  3. How can it be changed?
    Even if you implement load balancing and/or queue mirroring, you are still missing a number of requirements for a highly available RabbitMQ setup. Primarily, a RabbitMQ cluster is only a single logical broker, and at least two are required for an HA cluster.
  4. Is there a big performance trade-off between active/active and active/passive?
    I think you will start seeing performance penalties as soon as you introduce data replication and/or redundancy, and that affects both Active/Active and Active/Passive. Synchronous replication costs more than asynchronous replication. There's a lot more to it, but my feeling is that Active/Active may take the bigger hit, although this depends heavily on how fast all of the pieces work together. With Active/Passive and asynchronous replication across servers, performance may appear better, but in a failover you would need to wait for replication to complete before you can switch to the secondary server.
  5. What is the best practice to interact with active/active? i.e. install a load balancer? apache that will round robin
    RabbitMQ recommends using a load balancer so that you do not have to leak details about the nodes in your cluster to the clients.
  6. What is the best practice to interact with active/passive? if I interact with only the active - this is a single point of failure
    It is a point of failure, but with Active/Passive you can implement a failure strategy that retries the next available server or all remaining servers. With such a strategy in place, the capabilities of your cluster are merely degraded while a failover is happening instead of totally unavailable. You can also interact with the passive side, but the types of interactions may be very different (e.g. read-only access), since there may be fewer resources available on the passive side and there may be delays in data replication.
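
To make the "retry the next available server" idea from point 6 concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `connect_with_failover` and `fake_connect` are hypothetical names, the host names are made up, and `fake_connect` stands in for whatever call your real client library uses to open a connection.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def connect_with_failover(hosts: Iterable[str], connect: Callable[[str], T]) -> T:
    """Try each host in order and return the first successful connection."""
    errors = []
    for host in hosts:
        try:
            return connect(host)
        except OSError as exc:
            # ConnectionRefusedError and TimeoutError are OSError subclasses.
            errors.append(f"{host}: {exc}")
    # Every host failed: report all of the individual errors together.
    raise ConnectionError("all brokers unreachable: " + "; ".join(errors))


# Hypothetical stand-in for a real client call; only "mypc2" is reachable.
def fake_connect(host: str) -> str:
    if host != "mypc2":
        raise ConnectionRefusedError(f"{host} is down")
    return f"connected to {host}"


print(connect_with_failover(["mypc1", "mypc2", "mypc3"], fake_connect))
# prints "connected to mypc2"
```

In a real client you would pass a function that opens an actual broker connection, and you would probably add a delay or backoff between attempts. The point is only that the client, not RabbitMQ, decides where to go next when a node is unavailable.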

Erik Gillespie