I've recently started hosting a side project of mine on the new Azure VMs. The app uses Redis as an in-memory cache. Everything was working fine in my local environment, but now that I've moved the code to Azure I'm seeing some weird exceptions coming out of Booksleeve.

When the app first fires up, everything works fine. However, after about 5-10 minutes of inactivity, the next request to the app hits a network exception. (I'm at work right now and don't have the exact error messages on me, so I will post them when I get home if people think they're germane to the discussion.) This causes the internal MessageQueue to close, which results in every subsequent Enqueue() throwing an exception ("The Queue Is Closed").

So after some googling I found this SO post about a DIY connection manager: "Maintaining an open Redis connection using BookSleeve". I can certainly implement something similar if that's the best course of action.

So, questions:

  1. Is it normal for the RedisConnection to close periodically after a certain amount of time?
  2. I've seen the conn.SetKeepAlive() method but I've tried many different values and none seem to make a difference. Is there more to this or am I barking up the wrong tree?
  3. Is the connection manager idea from the post above the best way to handle this scenario?
  4. Can anyone shed any additional light on why hosting my Redis instance in a new Azure VM causes this issue? I can also confirm that if I run my local environment against the Azure Redis VM, I experience this issue.

Like I said, if it's unusual for a Redis connection to die after inactivity, I will post the stack traces and exceptions from my logs when I get home.

Thanks!

UPDATE: Didier pointed out in the comments that this may be related to the load balancer that Azure uses: http://blogs.msdn.com/b/avkashchauhan/archive/2011/11/12/windows-azure-load-balancer-timeout-details.aspx

Assuming that's the case, what would be the best way to implement a connection manager that could account for this goofy problem? I assume I shouldn't create a connection per unit of work, right?

Eric
  • What do you have in the timeout parameter of the Redis configuration file? This is the idle timeout (set it to 0 to avoid Redis closing idle connections). – Didier Spezia Jun 12 '12 at 17:16
  • The timeout is set to 0 already. Just double checked it as well. :( – Eric Jun 12 '12 at 17:25
  • It seems to be a "feature" of AzureVM ... See http://blogs.msdn.com/b/avkashchauhan/archive/2011/11/12/windows-azure-load-balancer-timeout-details.aspx – Didier Spezia Jun 12 '12 at 17:37
  • Oh wow - that's what I was afraid of :( Thanks for the information! I'm going to edit my question a bit and see if anyone else knows of a good way to work around this "feature" with Booksleeve. I have some ideas but ultimately would like an expert to weigh in. – Eric Jun 12 '12 at 17:42
  • @Eric out of curiosity - does your redis server have a connection timeout configured? If so, BookSleeve should automatically configure itself with a heartbeat... Maybe that would stop the load balancer from killing it? – Marc Gravell Jun 13 '12 at 15:20
  • @MarcGravell It currently has the "timeout" config value set to 0. Is this what you're thinking of? Should I try setting the timeout to something > 0 (maybe like 3600) to see if that makes a difference? – Eric Jun 13 '12 at 20:06
  • @Eric 3600 is an hour... any idea what the azure kill time is? I would say "shorter than that". Note that this is just to see whether booksleeve's heartbeat will keep the TCP alive - if it works, we can probably get it working without you needing to change the config. As a minor note: you don't need to restart redis to change the connection timeout - pretty sure you can do that in a redis-cli (or similar) session – Marc Gravell Jun 13 '12 at 20:22
  • @MarcGravell Ahh, I'm an idiot - I had things reversed in my head. I was thinking longer timeout was better for some reason :) The best information I could find on the Azure load balancer timeouts is "longer than 60 seconds" so I think if I set the Redis timeout to 30 I should be ok? I'll try that tonight when I get home from work and let you know how it goes! Thanks for your help! – Eric Jun 13 '12 at 20:31
  • @MarcGravell That seemed to do the trick! I switched the redis timeout to 30 seconds and since then have not experienced the issue! I'd be interested to see what your idea was for not having to change the Redis config, but the current solution is completely acceptable so I'm a very happy camper! – Eric Jun 14 '12 at 02:33
  • @Eric *probably* I mean `config set timeout 30`. – Marc Gravell Jun 14 '12 at 06:32

2 Answers


From other answers/comments, it sounds like this is caused by the Azure infrastructure shutting down sockets that look idle. You could simply have a timer somewhere that performs some kind of operation periodically, but note that this is already built into Booksleeve: when it connects, it checks what the redis connection timeout is, and configures a heartbeat to prevent redis from closing the socket. You might be able to piggy-back on this to prevent Azure from closing the socket too. For example, in a redis-cli session:

config set timeout 30

should configure redis (on the fly, without having to restart) to have a 30 second connection timeout. Booksleeve should then automatically take steps to ensure that there is a heartbeat shortly before 30 seconds. Note that if this is successful, you should also edit your configuration file so that this setting applies after the next restart too.
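
To make the "timer that performs some kind of operation periodically" idea concrete, here is a rough, language-neutral sketch in Python. It is not Booksleeve's actual implementation: the names `heartbeat_interval` and `Heartbeat`, the 5-second safety margin, and the 1-second floor are all assumptions made for illustration, and `ping` stands in for whatever cheap command your client can send (for example a Redis PING).

```python
import threading


def heartbeat_interval(server_timeout_s, margin_s=5.0):
    """Return how often to ping, or None when no idle timeout is configured.

    margin_s is an assumed safety margin, not a value taken from any library.
    """
    if server_timeout_s <= 0:
        return None  # timeout 0 means the server never closes idle sockets
    return max(1.0, server_timeout_s - margin_s)


class Heartbeat:
    """Calls `ping` every `interval_s` seconds on a daemon thread until stopped."""

    def __init__(self, ping, interval_s):
        self._ping = ping
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval_s):
            self._ping()
```

With a 30 second server timeout this would ping roughly every 25 seconds, which keeps traffic flowing well inside an idle window that is longer than the server timeout. In a real client the `ping` callable should also handle or report network errors, since an exception there would end the heartbeat thread.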

Marc Gravell
  • It's worth pointing out you need to actually set a timeout > 0 here for Booksleeve to do automated Ping() checks when conn.SetKeepAlive(true) is set. So on Azure, you need to set timeout to 30 seconds as above and also conn.SetKeepAlive(true). Without a config timeout set, Booksleeve won't do ping checks (even with keepalive = true, an internal check inside Booksleeve skips pings when there is no timeout set). This would mean Azure still shuts down the socket due to inactivity. – Andrew Jan 29 '13 at 00:30
  • Does StackExchange.Redis have this same heartbeat behavior if a timeout is present in the redis config? – slypete Jul 07 '14 at 18:32
  • @slypete yes, but it also has a default heartbeat if no timeout is present – Marc Gravell Jul 07 '14 at 19:50

The load balancer in Windows Azure will close an idle connection after some amount of time that depends on the total connection load on the load balancer, which is why you get seemingly random timeouts on your connection.

As I am not very familiar with Redis connections, I can't suggest how to implement it correctly; in general, however, the suggested workaround is to have a heartbeat pulse that keeps your session alive. Have you had a chance to look at the workaround suggested in the blog post and try to implement it with Redis, to see whether that works for you?

AvkashChauhan
  • Thanks Avkash. I was actually going to try using a Virtual Network in the new Azure preview, but unfortunately I am not allowed to create a VM from a custom image AND add it to an Affinity group. Apparently that's only allowed from the Quick Create screen? I have submitted a forum post to the virtual network forum as well. – Eric Jun 12 '12 at 23:21
  • Hi Eric, I saw your post and will be able to help you with the VM creation issue; however, there is some issue with the MSDN forum, so I cannot contact you. Once that is resolved I will contact you to see what can be done. Thanks. – AvkashChauhan Jun 12 '12 at 23:42
  • I believe I know what's going on. Apparently the image I created is only available at the location where it started life (which was US West). I created the vlan after the image, and now it appears it is not possible to move the image from one location to another. Do you have any idea if creating a vlan is going to be noticeably better from a performance standpoint? I mean, it's not too onerous to just create a heartbeat in my app if need be. – Eric Jun 12 '12 at 23:50