45

I am developing a Spring Boot REST API that handles a lot of incoming requests. My controller looks something like this:

@RestController
public class ApiController {

    List<ApiObject> apiDataList;

    @RequestMapping(value = "/data", produces = {MediaType.APPLICATION_JSON_VALUE}, method = RequestMethod.GET)
    public ResponseEntity<List<ApiObject>> getData() {
        List<ApiObject> apiDataList = getApiData();
        return new ResponseEntity<List<ApiObject>>(apiDataList, HttpStatus.OK);
    }

    @ResponseBody
    @Async
    public List<ApiObject> getApiData() {
        List<ApiObject> apiDataList3 = new ArrayList<ApiObject>(); // List is an interface; instantiate a concrete implementation
        //do the processing
        return apiDataList3;
    }
}

So now I want to set a rate limit per user, say each user can only make 5 requests per minute or something like that. How can I limit each user to 5 API calls per minute, and send a 429 response back if a user requests more than that? Do we need their IP address?

Any help is appreciated.

davioooh
Ricky
  • This is best and easiest done at the web server level. See [NGinx HTTP Limit module](http://nginx.org/en/docs/http/ngx_http_limit_req_module.html) or [Apache Rate Limit](https://httpd.apache.org/docs/trunk/mod/mod_ratelimit.html) module. – manish May 18 '17 at 08:51
  • I hope that can limit the accumulated API calls. For example, if it restricts 5 calls per minute and we have 10 users, then it limits to 50 calls per minute. What happens if one user requested 40 and the rest only 10 within a few seconds? Will it restrict all the API calls? – Ricky May 18 '17 at 09:04
  • You can look [here](http://stackoverflow.com/a/38479810/1125284) and I hope [Guava's RateLimiter](https://dzone.com/articles/ratelimiter-discovering-google) will help you! – Zico May 18 '17 at 09:10
  • Take a look at this answer http://stackoverflow.com/q/27595683/1061499. Also this post seems interesting: http://ec2-52-59-233-40.eu-central-1.compute.amazonaws.com/java-spring-mvc-rate-limit/ – davioooh May 18 '17 at 10:17
  • 1
    Take a look at [Bucket4j](https://github.com/vladimir-bukhtoyarov/bucket4j). I've started a [Spring Boot Starter for Bucket4j](https://github.com/MarcGiffing/bucket4j-spring-boot-starter) – meleagros Jul 23 '17 at 20:02
  • @Ricky did you explore the option of using the embedded web servers for this purpose? We now need this functionality in our project, and I wanted to know which option you finally chose. – Onki Aug 05 '19 at 04:45

3 Answers

47

Here is a solution for those who want to throttle the requests per second for each user (IP address). This solution requires the Caffeine library, which is a Java 8+ rewrite of Google's Guava cache. It uses a LoadingCache to store the request counts per client IP address. You will also need the javax.servlet-api dependency, because the request counting takes place in a servlet filter. Here's the code:

import java.io.IOException;
import java.util.concurrent.TimeUnit;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;

import com.github.benmanes.caffeine.cache.CacheLoader;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

@Component
public class RequestThrottleFilter implements Filter {

    private static final int MAX_REQUESTS_PER_SECOND = 5; //or whatever you want it to be

    // Maps a client IP address to the number of requests it made in the current one-second window.
    private final LoadingCache<String, Integer> requestCountsPerIpAddress;

    public RequestThrottleFilter() {
        // Entries expire one second after the last write, which defines the throttling window.
        requestCountsPerIpAddress = Caffeine.newBuilder()
                .expireAfterWrite(1, TimeUnit.SECONDS)
                .build(new CacheLoader<String, Integer>() {
                    public Integer load(String key) {
                        return 0;
                    }
                });
    }

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
    }

    @Override
    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain filterChain)
            throws IOException, ServletException {
        HttpServletRequest httpServletRequest = (HttpServletRequest) servletRequest;
        HttpServletResponse httpServletResponse = (HttpServletResponse) servletResponse;
        String clientIpAddress = getClientIP(httpServletRequest);
        if (isMaximumRequestsPerSecondExceeded(clientIpAddress)) {
            // Reject the request without invoking the rest of the filter chain.
            httpServletResponse.setStatus(HttpStatus.TOO_MANY_REQUESTS.value());
            httpServletResponse.getWriter().write("Too many requests");
            return;
        }

        filterChain.doFilter(servletRequest, servletResponse);
    }

    private boolean isMaximumRequestsPerSecondExceeded(String clientIpAddress) {
        Integer requests = requestCountsPerIpAddress.get(clientIpAddress);
        if (requests != null && requests > MAX_REQUESTS_PER_SECOND) {
            // Remove and re-insert the entry so its one-second expiry starts over;
            // a client that keeps spamming stays blocked until it pauses for a full second.
            requestCountsPerIpAddress.asMap().remove(clientIpAddress);
            requestCountsPerIpAddress.put(clientIpAddress, requests);
            return true;
        }
        if (requests == null) {
            requests = 0;
        }
        requests++;
        requestCountsPerIpAddress.put(clientIpAddress, requests);
        return false;
    }

    public String getClientIP(HttpServletRequest request) {
        String xfHeader = request.getHeader("X-Forwarded-For");
        if (xfHeader == null) {
            return request.getRemoteAddr();
        }
        return xfHeader.split(",")[0]; // in case the app sits behind a proxy
    }

    @Override
    public void destroy() {
    }
}

So what this basically does is store every requesting IP address in a LoadingCache. This is like a special map in which each entry has an expiration time. In the constructor the expiration time is set to one second, which means that on the first request an IP address plus its request count is stored in the LoadingCache for only one second; the entry is automatically removed from the map on expiration. If more requests come from that IP address during that second, isMaximumRequestsPerSecondExceeded(String clientIpAddress) adds them to the total request count, but first checks whether the maximum number of requests per second has already been exceeded. If that is the case, it returns true and the filter sends an error response with status code 429, which stands for Too Many Requests.

This way each user can only make a set number of requests per second.

Here is the Caffeine dependency to add to your pom.xml

    <dependency>
        <groupId>com.github.ben-manes.caffeine</groupId>
        <artifactId>caffeine</artifactId>
        <exclusions>
            <exclusion>
                <artifactId>logback-classic</artifactId>
                <groupId>ch.qos.logback</groupId>
            </exclusion>
            <exclusion>
                <artifactId>log4j-over-slf4j</artifactId>
                <groupId>org.slf4j</groupId>
            </exclusion>
        </exclusions>
    </dependency>

Please note the <exclusion> part. I am using log4j2 as the logging library instead of Spring's default Logback. If you are using Logback, you should remove the <exclusion> section from this POM dependency, or logging will not be enabled for this library.

EDIT: Make sure you let Spring do a component scan on the package where your filter lives, or the filter won't work. Also, because it is annotated with @Component, the filter applies to all endpoints by default (/*).

If Spring detected your filter, you should see something like this in the log during startup.

o.s.b.w.servlet.FilterRegistrationBean : Mapping filter:'requestThrottleFilter' to: [/*]
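
If you need the throttling only for some endpoints instead of the default /* mapping, one alternative (a sketch that is not part of the original answer; the ThrottleFilterConfig class and the /data pattern are placeholders) is to register the filter explicitly and drop the @Component annotation so it is not registered twice:

import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ThrottleFilterConfig {

    @Bean
    public FilterRegistrationBean<RequestThrottleFilter> throttleFilterRegistration() {
        // Register the filter manually so it only applies to the listed URL patterns.
        FilterRegistrationBean<RequestThrottleFilter> registration =
                new FilterRegistrationBean<>(new RequestThrottleFilter());
        registration.addUrlPatterns("/data"); // placeholder pattern, adjust to your endpoints
        return registration;
    }
}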

EDIT 19-01-2022:

I've noticed that my initial solution has one drawback when it comes to blocking too many requests, and I've changed the code because of it. I'll first explain why.

Consider a user who can make 3 requests per second. Let's imagine that within a given second the user makes the first request during the first 200 milliseconds of that second. This causes an entry for that user to be added to requestCountsPerIpAddress, and the entry will automatically expire after one second. Now consider that this same user makes 4 successive requests only in the final 100 milliseconds before the second elapses and the entry is deleted. That means the user is effectively blocked for at most a mere 100 milliseconds on the fourth request attempt. After those 100 milliseconds pass, he'll be able to immediately make three new requests.

As a consequence he is also able to make 5 requests within a second instead of 3. This can happen when there is at least a 500 millisecond delay between the first request (which creates the entry in the LoadingCache) and the next two requests (both made in the last 500 milliseconds before the current entry expires). If the user then immediately makes 3 requests right after the entry expires, he effectively manages to make 5 requests within a timespan of 1 second, whereas only 3 are allowed (2 made during the last 500 ms before the previous entry expired + 3 made during the first 500 ms of the new one). So that's not a very effective way to throttle the requests.

I've changed the library to Caffeine because there are some deadlock issues with the Guava library. If you want to keep using Guava itself, you should add the line requestCountsPerIpAddress.asMap().remove(clientIpAddress); right under if(requests > MAX_REQUESTS_PER_SECOND) { in the code. What this basically does is remove the current entry for the IP address. On the next line it gets added again, which resets that entry's expiry time back to one whole second.

This has the effect that anyone who just keeps spamming the REST endpoint with requests will keep getting a 429 response back until he stops sending requests for one second after his last request.

Maurice
  • 3
    Simple & Awesome Solution. Thanks for saving the day. – Vijay Satluri Jan 05 '20 at 10:08
  • @VijaySatluri I made a small improvement. I've added `requestCountsPerIpAddress.put(clientIpAddress, requests);` to the if clause `if(requests > MAX_REQUESTS_PER_SECOND){`. This prevents the IP/request-count pair from expiring after one second while a spammer continues to bombard the server with requests. Only when there is a pause of 1 second between requests does the entry expire. – Maurice Jan 12 '20 at 01:23
  • Made one more addition. It's best not to use sendError(), or so I've found out. That method will not abort the request but will instead continue executing the code in the filter and that of any filters after it. It's better to simply return from the filter method without letting filterChain.doFilter() be called. You simply adjust the content of the httpResponse by providing the error status code and a message via the writer. – Maurice Jan 12 '20 at 02:06
  • 1
    Nice answer, saved my job. Thanks. – Augusto Jul 20 '20 at 14:41
  • 2
    Why doesn't Spring Boot just include this awesome answer? – trilogy Jul 29 '20 at 18:51
  • 3
    Beware of this answer: it does NOT use the leaky bucket algorithm. Which means that, in the case of this 5 requests / second setup, the last of 6 consecutive requests made every 0.5 seconds will be blocked. So this does not actually mean 5 requests / second, but at most 5 requests until there is a pause of at least 1 second. Very different. – andras May 29 '21 at 19:01
  • 1
    Feel free to post an improved version if you want @andras – Maurice May 29 '21 at 19:37
  • @andras I've made an important change to the code which makes it a lot better at throttling requests, please read the EDIT text to understand why. – Maurice Jan 19 '22 at 01:03
  • @VijaySatluri I've made an important change to the code which makes it a lot better at throttling requests, please read the EDIT text to understand why. – Maurice Jan 19 '22 at 01:03
  • @trilogy I've made an important change to the code which makes it a lot better at throttling requests, please read the EDIT text to understand why. – Maurice Jan 19 '22 at 01:03
  • @Augusto I've made an important change to the code which makes it a lot better at throttling requests, please read the EDIT text to understand why. – Maurice Jan 19 '22 at 01:03
  • The dependencies don't work. Is LoadingCache part of Guava or Caffeine? – trilogy Jan 21 '22 at 15:31
  • 1
    @trilogy it's part of Caffeine, please add `import com.github.benmanes.caffeine.cache.LoadingCache` – Maurice Jan 21 '22 at 15:39
  • When deriving the “real client IP address” from the X-Forwarded-For header, use the rightmost IP in the list. The leftmost IP in the XFF header is commonly considered to be “closest to the client” and “most real”, but it’s trivially spoofable. Don’t use it for anything even close to security-related. should be xfHeader.split(",").last(); – Holm May 11 '22 at 08:18
  • How would you limit this to just one endpoint? I mean, you can always check in the filter which endpoint the request was made to, but I'm asking whether there is something more elegant. – Renis1235 Jun 17 '22 at 20:45
  • 1
    @Renis1235 you can use the `HttpServletRequest` object to find out which endpoint is being called. Then you can use that information to only run the throttle code when a specific endpoint is called. `if(httpServletRequest.getRequestURI().contains("/your-endpoint")){//rate limit code}` – Maurice Jun 18 '22 at 16:22
26

Spring does not provide such a rate-limiting component out of the box. You have a few options:

  • You can build it as part of your solution. Create a filter and register it in your Spring context. The filter should intercept each incoming call and count requests per user within a time window. I would use the token bucket algorithm, as it is the most flexible (a rough sketch of the idea follows this list).
  • You can build a component that is independent of your current solution: create an API gateway that does the job. You could extend the Zuul gateway and, again, use the token bucket algorithm.
  • You can use an already built component, like Mulesoft ESB, which can act as an API gateway and supports rate limiting and throttling. I've never used it myself.
  • And finally, you can use an API Manager, which has rate limiting, throttling and much more. Check out MuleSoft, WSO2, 3Scale, Kong, etc. (most will have a cost, some are open source and have a community edition).
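
For the first option, here is a minimal token bucket sketch (an illustration only, not code from this answer; the class name is made up, and a real filter would keep one bucket per user, for example in a ConcurrentHashMap keyed by username or IP address):

import java.time.Duration;
import java.time.Instant;

// Minimal in-memory token bucket: 'capacity' tokens, refilled evenly over 'refillPeriod'.
public class TokenBucket {

    private final long capacity;
    private final double refillPerSecond;
    private double tokens;
    private Instant lastRefill;

    public TokenBucket(long capacity, Duration refillPeriod) {
        this.capacity = capacity;
        this.refillPerSecond = (double) capacity / refillPeriod.getSeconds();
        this.tokens = capacity;
        this.lastRefill = Instant.now();
    }

    // Returns true if a token was available; call this once per incoming request.
    public synchronized boolean tryConsume() {
        refill();
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false; // caller should answer with 429 Too Many Requests
    }

    private void refill() {
        Instant now = Instant.now();
        double elapsedSeconds = Duration.between(lastRefill, now).toMillis() / 1000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefill = now;
    }
}

A filter would look up (or create) the bucket for the current user, call tryConsume() once per request, and reject the request with a 429 status whenever it returns false.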
lealceldeiro
Daniel Cerecedo
  • How about using this functionality in the Spring Boot embedded web servers? I am exploring this option but not getting any specific lead. Can this be done at the level of the web server provided by Spring Boot? – Onki Aug 05 '19 at 04:47
24

Spring does not have rate-limiting out of the box.

There is the bucket4j-spring-boot-starter project, which uses the Bucket4j library with the token-bucket algorithm to rate-limit access to the REST API. You can configure it via the application properties file. There is an option to limit the access based on IP address or username.

As an example, here is a simple setup which allows a maximum of 5 requests within 10 seconds, independently of the user:

bucket4j:
  enabled: true
  filters:
  - cache-name: buckets
    url: .*
    rate-limits:
    - bandwidths:
      - capacity: 5
        time: 10
        unit: seconds
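
Under the hood the starter builds token buckets with the core Bucket4j library. As a rough sketch of the equivalent 5-requests-per-10-seconds bucket in plain Java (assuming a recent Bucket4j version; a real setup would hold one bucket per user or IP address, not a single shared one):

import java.time.Duration;

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;

public class Bucket4jSketch {

    public static void main(String[] args) {
        // A bucket holding at most 5 tokens, refilled with 5 tokens every 10 seconds.
        Bucket bucket = Bucket.builder()
                .addLimit(Bandwidth.simple(5, Duration.ofSeconds(10)))
                .build();

        for (int i = 1; i <= 6; i++) {
            if (bucket.tryConsume(1)) {
                System.out.println("request " + i + " allowed");
            } else {
                System.out.println("request " + i + " rejected -> respond with 429");
            }
        }
    }
}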

If you are using Netflix Zuul you could use Spring Cloud Zuul RateLimit which uses different storage options: Consul, Redis, Spring Data and Bucket4j.

Lukasz R.
  • How about using this functionality in the Spring Boot embedded web servers? I am exploring this option but not getting any specific lead. Can this be done at the level of the web server provided by Spring Boot? – Onki Aug 05 '19 at 04:47
  • I tried with bucket4j-spring-boot-starter + spring-boot-starter-data-redis + spring-boot-starter-cache and it works like a charm. I do not recommend using Zuul for the sole purpose of caching. – Philippe Simo Sep 19 '22 at 21:17