
I have a production Rails application that goes down several times per day. In addition to serving its users, this application is the endpoint for a third-party website that sends it updates.

Occasionally, these updates come flooding in so fast that the requests back up and the application becomes unavailable for long periods of time. This is legitimate usage, but it ends up causing a denial of service.

The request from the 3rd party is pretty simple:

class NotificationsController < ApplicationController

  def notify
    begin
      notification_xml = request.body.read
      notification_hash = Hash.from_xml(notification_xml)['Envelope']['Body']['NotificationResponse']
      user = User.find(notification_hash['UserID'])
      user.delay.set_notification(notification_hash)
    rescue Exception => bang
      logger.error bang.backtrace
      unless user.blank?
        alert_file_name = "#{user.id}_#{notification_hash['Message']['MessageID']}_#{notification_hash['NotificationEventName']}_#{notification_hash['Timestamp']}.xml"
        File.open(alert_file_name, 'w') {|f| f.write(notification_xml) }
      end
    end
    render nothing: true, status: 200
  end

end

I have two app servers in front of a very large database. However, when this third-party website really hits us with notification requests, anywhere from 200 to nearly 1,000 requests per minute, both web servers get completely tied up.

You can also see above that I'm using the .delay call since I'm using Sidekiq. I thought that would help, and it did for a while, but the application can't handle that many requests.

Other than handling the requests in a separate application, which I'm not sure is really possible in my EngineYard installation, is there something I can do to speed up the handling of this request?

RubyRedGrapefruit

2 Answers


If it takes too long to process all those requests, try a different approach.

Create a new model (I will call it Request) with only one field (I'll name it message) - the xml sent to you by that 3rd party app.

Rewrite your notify action to be very simple and fast:

def notify
  # request.body is an IO object; read it to store the raw XML string
  Request.create(message: request.body.read)
  render nothing: true, status: 200
end

Create a new action, let's say process_requests, like this:

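def process_requests
  # find_in_batches always iterates in ascending primary-key order,
  # so no explicit order clause is needed
  Request.find_in_batches(batch_size: 100) do |group|
    group.each do |request|
      # Pass the stored XML string, not the record itself
      process_request(request.message)
      request.destroy
    end
  end
end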

def process_request(notification_xml)
  begin
    notification_hash = Hash.from_xml(notification_xml)['Envelope']['Body']['NotificationResponse']
    user = User.find(notification_hash['UserID'])
    user.set_notification(notification_hash)

  rescue Exception => bang
    logger.error bang.backtrace

    # Keep a copy of the payload on disk if we got far enough to identify the user
    unless user.blank?
      alert_file_name = "#{user.id}_#{notification_hash['Message']['MessageID']}_#{notification_hash['NotificationEventName']}_#{notification_hash['Timestamp']}.xml"
      File.open(alert_file_name, 'w') {|f| f.write(notification_xml) }
    end
  end
end

Create a cron job that calls process_requests at a defined interval (every few minutes). I have never used Sidekiq, so I used find_in_batches instead (the batch size of 100 is just for the sake of example).

The notify action shouldn't run for more than a few milliseconds (inserts are pretty fast), so this should be able to handle the incoming traffic in your critical moments.

If you try something similar and it helps reduce the load on your servers in critical moments, let me know :D

If this turns out to be useful and you add background processing here too, please post that for others to see.

cristian

If you're monitoring this app with New Relic/AppNet/something else, checking your reports might give you an idea of some low-hanging fruit. We've only got a small picture of the application here; it's possible that enhancements elsewhere in the app might help as well.

With that said, here are a few ideas which can be applied separately or together:

Do Less Work on Intake

Right now you're doing a bunch of XML processing, which is expensive, before you pass the job off to Sidekiq. That's a choke point, and because it runs in the app process, it's tying up your application.

If your Redis instance has enough memory, consider refactoring notify so the whole XML payload gets passed off to Sidekiq. You're already always returning a 200 response to the API consumer, so there's no impact on your existing external API.

Your worker instances can then process the XML payloads at their own pace without impacting the application.
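As a rough sketch of what that refactoring could look like: the worker below takes the raw XML string and does the parsing itself. The class name NotificationWorker is hypothetical, and the code assumes Sidekiq and Rails (Hash.from_xml, the User model) are already loaded by the application. The error handling from the original action is left out for brevity.

```ruby
# Sketch only: NotificationWorker is a hypothetical name. Hash.from_xml and
# User are assumed to come from the Rails application that loads this file.
class NotificationWorker
  # Guarded so the file also loads where Sidekiq is absent (e.g. a quick
  # console check); in the real app this would simply be `include Sidekiq::Worker`.
  include Sidekiq::Worker if defined?(Sidekiq::Worker)

  # Receives the raw XML string, so the expensive parsing happens in the
  # worker process instead of the web process.
  def perform(notification_xml)
    notification_hash =
      Hash.from_xml(notification_xml)['Envelope']['Body']['NotificationResponse']
    user = User.find(notification_hash['UserID'])
    user.set_notification(notification_hash)
  end
end
```

With something like this in place, the notify action would shrink to a single enqueue call such as NotificationWorker.perform_async(request.body.read) followed by the existing render. Keep in mind that each queued payload then sits in Redis until a worker picks it up.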

Implement API Throttling

The third-party site is hammering you at a tremendous rate not normally permitted even by huge sites. That's a problem.

If you can't get them to address it on their end, play like the big dogs: Implement request throttling on your end. You likely have some ability to do this at the Rack level on EngineYard (though a quick search of their docs didn't immediately yield anything), but even doing it at the application level is likely to improve things.

There's a previous Stack Overflow discussion that may offer a couple options.
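For illustration, here is a minimal sketch of application-level throttling written as plain Rack middleware with a fixed-window counter. The class name, the path, and the limit and window values are all assumptions, not anything from the original app. The counter lives in process memory, so each app server process would enforce its own limit; a shared store would be needed for a global one. It is also worth checking whether the third party retries after a 429, since otherwise throttled notifications are simply lost.

```ruby
# Hypothetical Rack middleware: the class name, path, limit and window are
# illustrative values only.
class NotificationThrottle
  def initialize(app, limit: 200, window: 60, path: '/notifications/notify',
                 clock: -> { Time.now.to_f })
    @app = app
    @limit = limit     # max requests allowed per window
    @window = window   # window length in seconds
    @path = path       # only this path is throttled
    @clock = clock     # injectable so the window can be tested
    @mutex = Mutex.new
    @window_start = nil
    @count = 0
  end

  def call(env)
    # Requests to other paths (e.g. human users) pass straight through.
    return @app.call(env) unless env['PATH_INFO'] == @path
    return @app.call(env) if allow?

    # 429 Too Many Requests; retry-after hints when the client may try again.
    [429,
     { 'content-type' => 'text/plain', 'retry-after' => @window.to_s },
     ["Too Many Requests\n"]]
  end

  private

  # Fixed-window counter: reset the count when a new window begins,
  # then admit the request only while the count is within the limit.
  def allow?
    @mutex.synchronize do
      now = @clock.call
      if @window_start.nil? || now - @window_start >= @window
        @window_start = now
        @count = 0
      end
      @count += 1
      @count <= @limit
    end
  end
end
```

In a Rails app, middleware like this would typically be registered with config.middleware.use in the application configuration.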

Proxy the API

A few services exist that will proxy your API for you, allowing you to easily implement features like rate limiting, throttling, and quotas that might otherwise be difficult to add.

The one I'm familiar with off the top of my head is Azure's API Management service. If this isn't a revenue-generating project, the cost might be prohibitive. ($49/month postpaid, though it would be cheaper prepaid, or could even be free if you qualify for BizSpark.)

Farm the API Out

The more advanced cousin of API proxies, "API as a Service" actually lets you run your API on its own VM instance—as well as offering the features a proxy does. If your database isn't a choke point, this can be a way to spread the load out and help prevent machine clients from affecting the experience of human clients.

The ten thousand pound gorilla is Apigee, though there are a variety of other established and startup options.

There is a catch: Most of these services are built around Node.js. If your Rails app is already leaning toward service-oriented architecture, and if you know and like JavaScript, this may not be an issue for you. Otherwise, the need to build an interface between services and maintain a service in a second language may be a bridge too far.

colinm