How many Requests per Minute are considered 'Heavy Load'? (Approximation)

Question

People often talk about 'heavy load' in their (optimization and performance related) questions and answers.

I'm trying to quantify this in the context of a regular web application on a typical server (take SO & its fairly small infrastructure as example) in a number of Requests per Minute, assuming that they return immediately (to simplify and take database speeds etc. out of the equation).

I'm looking for a nominal number/range, not 'where the CPU maxes out' or similar. A rough approximation would be great (e.g. >5000/min). Thank you!

score 51 · Accepted Answer · answered Aug 24 '09 at 00:40

Given that you don't want a hardware load measure (CPU, memory, IO utilization), I would say the proper answer is that heavy load is any number of requests per unit of time at or above the required maximum number of requests per unit of time.

The required maximum number of requests is whatever has been defined with the customer or with whoever is in charge of the overall architecture.

Say X is that required maximum load for the application. I think something like this would approximate the answer:

0 < Light Load < X/2 < Regular Load < 2X/3 < High Load < X <= Heavy Load
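
As a minimal sketch, the tiers above can be written as a small classifier. The name `classify_load` and the treatment of exact boundary values are assumptions made for illustration; X is whatever maximum has been agreed for your application.

```python
def classify_load(rate: float, required_max: float) -> str:
    """Map a request rate onto the tiers of the inequality above.

    rate and required_max must use the same unit (e.g. requests/minute).
    Boundary values fall into the higher tier, which is an assumption.
    """
    if required_max <= 0:
        raise ValueError("required_max must be positive")
    if rate < required_max / 2:
        return "light"
    if rate < 2 * required_max / 3:
        return "regular"
    if rate < required_max:
        return "high"
    return "heavy"  # at or above the required maximum


# Hypothetical example: the agreed maximum is 12000 requests/minute.
for observed in (3000, 7000, 10000, 12000):
    print(observed, classify_load(observed, 12000))
```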

The thing with a single number out of thin air is that it has no relation whatsoever to your application. What counts as heavy load is totally, absolutely, inescapably tied to what the application is supposed to do.

That said, 200 requests per second (~12000 a minute) is a load that would keep small web servers busy.

score 23 · Answer 2 · answered Feb 05 '16 at 13:29

Several hundred requests per second.

The out-of-the-box limit on open connections for most servers is usually around 256 or fewer, ergo 256 requests per second. You can push it up to 2000-5000 for ping requests or to 500-1000 for lightweight requests. Going even higher is very difficult and requires changes all the way through the network, hardware, OS, server application and user application (see the C10k problem).

Seek time plus latency is around 1-10 ms for HDDs and 0.1-1 ms for SSDs. So, it's 100-100 000 IOPS. Let's take 100 000 as the top value (SSD sequential write).

A connection usually stays open for at least one latency period. Latency from client to server is rarely below 50-100 ms, so only 100 000/50 = 2000 IOPS can create new connections.

So, 2000 ping requests per second from different clients is a base upper limit for a normal server. It can be improved by using a RAM disk or adding more SSDs to increase IOPS, routing requests to reduce ping, changing or modifying the OS to reduce kernel overhead, etc. In practice the limit is usually higher as well, because many requests come from the same client (connection) and the number of clients is limited. In good conditions it can go up to hundreds of thousands.
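
The back-of-envelope arithmetic above can be reproduced in a few lines. This is only a sketch of the rule of thumb as stated, not a validated capacity model, and both function names are made up for illustration. Note that back-to-back operations at 0.1 ms each work out to about 10 000 per second, so the 100 000 top value is taken as given rather than derived.

```python
def serial_ops_per_second(latency_ms: float) -> float:
    """Operations per second if each takes latency_ms and they run back to back."""
    return 1000.0 / latency_ms


def new_connections_per_second(peak_iops: float, client_latency_ms: float) -> float:
    """The rule of thumb used above: peak IOPS divided by client latency in ms."""
    return peak_iops / client_latency_ms


print(serial_ops_per_second(10))    # slow end of the HDD range
print(serial_ops_per_second(0.1))   # fast end of the SSD range
print(new_connections_per_second(100_000, 50))  # the 100 000/50 step
```

The last call reproduces the 2000 figure; with 100 ms latency the same rule gives half of that.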

On the other hand, higher ping, application execution time, and OS and hardware imperfections can easily reduce the base value to several hundred requests per second. Also, typical web servers and applications are usually not very well suited to high-level optimization, so Vinko Vrsalovic's suggestion of 200 is pretty realistic.

score 6 · Answer 3 · edited May 17 '18 at 07:14

This is not a straightforward question that can be answered with a simple requests/minute number.

In the telecom sector, we often do performance testing and we simulate running lots of calls per second to try and find out the limit. We keep upping the call rate until the server fails to keep up.

So, it depends on your server and what it can handle. It also depends on your perspective. For example, an old 386 might only handle a measly 50 requests/minute. I'd call that a light load. But a high spec'd server might be capable of handling 60000 requests/minute. This is just guessing. I have no idea whether Apache could do this. Our telecom software certainly can.

I think it's best to answer this from the server perspective. I would say very heavy load is when you come within 10% of what your server is capable of handling, sustained over several minutes or tens of minutes. Heavy load is when you come within 15%.
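
A rough sketch of that rule, assuming you already know the server's measured capacity and have a window of per-minute request counts. Treating the lowest rate in the window as the "sustained" rate is an assumption, and `load_label` is a hypothetical name.

```python
from typing import Sequence


def load_label(per_minute_rates: Sequence[float], capacity: float) -> str:
    """Label a window of per-minute request rates against measured capacity."""
    if not per_minute_rates:
        raise ValueError("need at least one sample")
    # Assumption: the load counts as sustained only if every minute
    # in the window stays at or above the threshold.
    sustained = min(per_minute_rates)
    if sustained >= 0.90 * capacity:   # within 10% of capacity
        return "very heavy"
    if sustained >= 0.85 * capacity:   # within 15% of capacity
        return "heavy"
    return "below heavy"


# Hypothetical server that tops out at 60000 requests/minute.
print(load_label([55000, 56000, 58000], 60000))
print(load_label([52000, 53000, 57000], 60000))
print(load_label([40000, 59000, 59500], 60000))
```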

duffymo · Answer 4 · 2009-08-24T00:42:38.447

It's hard to answer, because load isn't simply a matter of requests per unit time. It depends on what those requests are doing and how they're implemented.

For example, more reads than writes might mean a lighter load.

Asynchronous processing of writes might mean a lighter load than having to wait for synchronous processing to complete.

One extreme would be stock trading systems that handle billions of transactions each trading day. Look at the typical volume on the NYSE or NASDAQ and use that to estimate a high value per minute.

Let's say 2B transactions in a trading day is representative for NASDAQ. Markets open at 9AM and close at 4PM, so that's 7 hours*3600 seconds/hour = 25200 seconds. That would give an average of 2B transactions/25200 seconds = 79,365 transactions per second - a very high load, indeed. They obviously use lots of servers, so you'd need that number to figure out what the load per server should be.
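
The same average can be checked with a couple of lines. This is just the arithmetic above, using the 9AM to 4PM window as stated; the function name is made up for illustration. The second call uses the roughly 10 million trades per day mentioned in a comment on this answer.

```python
def average_tps(transactions_per_day: float, open_hour: int, close_hour: int) -> float:
    """Average transactions per second over one trading session."""
    session_seconds = (close_hour - open_hour) * 3600
    return transactions_per_day / session_seconds


print(round(average_tps(2_000_000_000, 9, 16)))  # the 2B/day assumption
print(round(average_tps(10_000_000, 9, 16)))     # ~10 million trades/day
```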

If SO can be considered a good benchmark, you might ask about its volume on meta.

Yes, and OPRA is > 1.2 million messages/sec; the NASDAQ and NYSE feeds are a piece of cake... — pgast, Aug 24 '09 at 01:20
Actually, as of 2014, NASDAQ only handles about 10 million trades per day. That is about 2 billion shares per day, but not 2 billion transactions. Each trade probably involves multiple transactions, but not nearly enough to get to the levels quoted. — Ted Dunning, Aug 11 '14 at 05:53

score 2 · Answer 5 · answered Dec 19 '18 at 11:20

Heavy load is whatever is greater than what was stated in the requirements. You need to know how your application will be used to determine what might constitute heavy load. Otherwise you might end up building a Ferrari that will only be used to do the groceries. A great experience, but a waste of resources.
