I have a C# application that queues tasks in RabbitMQ to be handled by multiple worker machines for distributed computing. Each worker machine runs a process that hosts many handlers, each of which handles messages from a single queue. Each Handler is launched in its own app-domain. When this works properly, the queue handlers can stay online for months with no disruption or issues.

One of my queue handlers consistently stops receiving messages on worker machines. The workers don't all drop at once; they fail a few at a time.

After adding logging, I found that the RabbitMQ connection is closed, with the close reason given as "End of Stream". There is no fixed interval or clear event that seems to cause this, and it only happens with this one handler. All of the others continue working, so it is not a networking issue.

What can I do to find the root cause? What aspect of the C# code handling the messages could cause the rabbit channel to close?

Thank you in advance for any ideas for logging, debugging, or fixing this issue.

2 Answers

If each handler is in its own AppDomain, each must have its own IConnection. Do your workers publish any messages, such as updating and requeuing a message or publishing status information? I suspect that this specific handler suddenly stops working because the broker abruptly closes its connection when the worker performs some action that exceeds the frame size on that connection. When a message exceeds the maximum frame size, the broker closes the connection without warning, and the client sees the stream end. The broker log will show that the frame size was exceeded and by how much.

The default frame limit is 128 KB. Check whether that worker does anything that might exceed it. Other things can also cause the broker to close the connection, but this is the one that catches me most often. In any case, the RabbitMQ broker log should help you find the cause.
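One way to test the frame-size theory from the client side is to log the size of every outgoing body just before it is published or requeued. The following is only a minimal sketch and not part of the RabbitMQ client: `PayloadSizeCheck`, `AssumedFrameMaxBytes`, and `Describe` are hypothetical names, and the 128 KB figure is assumed from the default cited above. The limit actually negotiated on a given connection may differ.

```csharp
using System;

// Hypothetical helper for ruling the frame-size theory in or out.
public static class PayloadSizeCheck
{
    // Assumption: the 128 KB default cited above, not a value read from the broker.
    public const int AssumedFrameMaxBytes = 128 * 1024;

    // True when the body alone is larger than the assumed limit.
    public static bool ExceedsAssumedFrameMax(byte[] body)
    {
        if (body == null) throw new ArgumentNullException("body");
        return body.Length > AssumedFrameMaxBytes;
    }

    // Builds a log line to write just before each publish or requeue.
    public static string Describe(string queue, byte[] body)
    {
        bool over = ExceedsAssumedFrameMax(body); // also rejects a null body
        return string.Format(
            "queue={0} bytes={1} overAssumedLimit={2}",
            queue, body.Length, over);
    }
}
```

If the last line logged before each "End of Stream" shows a small body, the frame-size explanation becomes much less likely and the broker log is the next place to look.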

DaveC
  • The worker does occasionally publish or requeue, but the messages are no longer than 256 bytes... so frame size should not be an issue. I will try to turn on stronger logging and see if anything shows up. – David Schwartz Jun 15 '15 at 19:48
  • Is this the only worker that publishes or requeues? Or could it be more susceptible to a race condition if different operations share an IModel instance? If multiple threads may access the model (a.k.a. channel) at once, you need to serialize access to it with a lock. Otherwise you can get protocol errors, which can also cause the server to kill the connection. – DaveC Jun 15 '15 at 19:57
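The locking suggested in the comment above could look roughly like the following. This is a sketch under assumptions, not code from the RabbitMQ client: `SerializedChannel<TChannel>` and `Use` are hypothetical names, and the wrapper is generic so that it compiles without the client library. In the handler, `TChannel` would be the shared `IModel`.

```csharp
using System;

// Hypothetical wrapper: every operation on the shared channel runs under one lock,
// so two threads can never interleave their use of it.
public sealed class SerializedChannel<TChannel> where TChannel : class
{
    private readonly TChannel _channel;
    private readonly object _gate = new object();

    public SerializedChannel(TChannel channel)
    {
        if (channel == null) throw new ArgumentNullException("channel");
        _channel = channel;
    }

    // Runs the operation while holding the lock; the lock is released
    // even if the operation throws.
    public void Use(Action<TChannel> operation)
    {
        if (operation == null) throw new ArgumentNullException("operation");
        lock (_gate)
        {
            operation(_channel);
        }
    }
}
```

Publishing and acknowledging would then both go through `Use`, for example `shared.Use(ch => ch.BasicAck(deliveryTag, false))`, where `shared` wraps the handler's `IModel` and `deliveryTag` is whatever tag the consumer received. This only helps if no code keeps a direct reference to the channel and bypasses the wrapper.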
IConnection.Dispose() might throw EndOfStreamException as mentioned here: RabbitMQ C# driver stops receiving messages
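If that is what is happening, the cleanup path can be guarded so that the exception is captured and logged instead of escaping. A minimal sketch, assuming the connection is disposed through its `IDisposable` interface and that the exception is `System.IO.EndOfStreamException`; `ConnectionCleanup` and `DisposeQuietly` are hypothetical names.

```csharp
using System;
using System.IO;

// Hypothetical helper for shutting down a connection that may already be dead.
public static class ConnectionCleanup
{
    // Disposes the resource. Returns the EndOfStreamException if Dispose threw one,
    // or null if disposal completed normally (or there was nothing to dispose).
    public static EndOfStreamException DisposeQuietly(IDisposable resource)
    {
        if (resource == null) return null;
        try
        {
            resource.Dispose();
            return null;
        }
        catch (EndOfStreamException ex)
        {
            // The caller decides whether to log this or reconnect.
            return ex;
        }
    }
}
```

Any other exception type still propagates, so unrelated failures during shutdown are not hidden.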

Community