Request Processing
Nginx request processing is is done with a number of different phases, with each phase having one or more handlers. Modules can register to run at a specific phase.
http://www.nginxguts.com/phases/
http://nginx.org/en/docs/dev/development_guide.html#http_phases
Rate Limiting
Rate limiting is applied by the ngx_http_limit_req_module
in the pre access phase. From ngx_http_limit_req_module.c:
h = ngx_array_push(&cmcf->phases[NGX_HTTP_PREACCESS_PHASE].handlers);
Caching
Caching is done later, I think in the content phase.
I couldn't quite figure this out from looking at the code or the documentation. But I was able to demonstrate this with a debug build. My configuration had rate limiting 1 request per second, with caching on. See the following excerpts from my log.
Cached Request
...
2020/08/01 11:11:07 [debug] 17498#0: *7 http header done
2020/08/01 11:11:07 [debug] 17498#0: *7 rewrite phase: 0
2020/08/01 11:11:07 [debug] 17498#0: *7 test location: "/"
2020/08/01 11:11:07 [debug] 17498#0: *7 using configuration "=/"
2020/08/01 11:11:07 [debug] 17498#0: *7 http cl:-1 max:1048576
2020/08/01 11:11:07 [debug] 17498#0: *7 rewrite phase: 2
2020/08/01 11:11:07 [debug] 17498#0: *7 post rewrite phase: 3
2020/08/01 11:11:07 [debug] 17498#0: *7 generic phase: 4
2020/08/01 11:11:07 [debug] 17498#0: *7 http script var: ....
2020/08/01 11:11:07 [debug] 17498#0: shmtx lock
2020/08/01 11:11:07 [debug] 17498#0: shmtx unlock
2020/08/01 11:11:07 [debug] 17498#0: *7 limit_req[0]: 0 0.000
2020/08/01 11:11:07 [debug] 17498#0: *7 generic phase: 5
2020/08/01 11:11:07 [debug] 17498#0: *7 access phase: 6
2020/08/01 11:11:07 [debug] 17498#0: *7 access phase: 7
2020/08/01 11:11:07 [debug] 17498#0: *7 post access phase: 8
2020/08/01 11:11:07 [debug] 17498#0: *7 generic phase: 9
2020/08/01 11:11:07 [debug] 17498#0: *7 generic phase: 10
2020/08/01 11:11:07 [debug] 17498#0: *7 http init upstream, client timer: 0
2020/08/01 11:11:07 [debug] 17498#0: *7 http cache key: "http://127.0.0.1:9000"
2020/08/01 11:11:07 [debug] 17498#0: *7 http cache key: "/"
2020/08/01 11:11:07 [debug] 17498#0: *7 add cleanup: 00005609F7C51578
2020/08/01 11:11:07 [debug] 17498#0: shmtx lock
2020/08/01 11:11:07 [debug] 17498#0: shmtx unlock
2020/08/01 11:11:07 [debug] 17498#0: *7 http file cache exists: 0 e:1
2020/08/01 11:11:07 [debug] 17498#0: *7 cache file: "/home/poida/src/nginx-1.15.6/objs/cache/157d4d91f488c05ff417723d74d65b36"
2020/08/01 11:11:07 [debug] 17498#0: *7 add cleanup: 00005609F7C46810
2020/08/01 11:11:07 [debug] 17498#0: *7 http file cache fd: 12
2020/08/01 11:11:07 [debug] 17498#0: *7 read: 12, 00005609F7C46890, 519, 0
2020/08/01 11:11:07 [debug] 17498#0: *7 http upstream cache: 0
2020/08/01 11:11:07 [debug] 17498#0: *7 http proxy status 200 "200 OK"
2020/08/01 11:11:07 [debug] 17498#0: *7 http proxy header: "Server: SimpleHTTP/0.6 Python/3.8.5"
2020/08/01 11:11:07 [debug] 17498#0: *7 http proxy header: "Date: Sat, 01 Aug 2020 01:11:03 GMT"
2020/08/01 11:11:07 [debug] 17498#0: *7 http proxy header: "Content-type: text/html; charset=utf-8"
2020/08/01 11:11:07 [debug] 17498#0: *7 http proxy header: "Content-Length: 340"
2020/08/01 11:11:07 [debug] 17498#0: *7 http proxy header done
2020/08/01 11:11:07 [debug] 17498#0: *7 http file cache send: /home/poida/src/nginx-1.15.6/objs/cache/157d4d91f488c05ff417723d74d65b36
2020/08/01 11:11:07 [debug] 17498#0: *7 posix_memalign: 00005609F7C46DC0:4096 @16
2020/08/01 11:11:07 [debug] 17498#0: *7 HTTP/1.1 200 OK
...
Rate Limited Request
...
2020/08/01 11:17:04 [debug] 17498#0: *10 http header done
2020/08/01 11:17:04 [debug] 17498#0: *10 rewrite phase: 0
2020/08/01 11:17:04 [debug] 17498#0: *10 test location: "/"
2020/08/01 11:17:04 [debug] 17498#0: *10 using configuration "=/"
2020/08/01 11:17:04 [debug] 17498#0: *10 http cl:-1 max:1048576
2020/08/01 11:17:04 [debug] 17498#0: *10 rewrite phase: 2
2020/08/01 11:17:04 [debug] 17498#0: *10 post rewrite phase: 3
2020/08/01 11:17:04 [debug] 17498#0: *10 generic phase: 4
2020/08/01 11:17:04 [debug] 17498#0: *10 http script var: ....
2020/08/01 11:17:04 [debug] 17498#0: shmtx lock
2020/08/01 11:17:04 [debug] 17498#0: shmtx unlock
2020/08/01 11:17:04 [debug] 17498#0: *10 limit_req[0]: -3 0.707
2020/08/01 11:17:04 [error] 17498#0: *10 limiting requests, excess: 0.707 by zone "mylimit", client: 127.0.0.1, server: , request: "GET / HTTP/1.1", host: "localhost:8080"
2020/08/01 11:17:04 [debug] 17498#0: *10 http finalize request: 503, "/?" a:1, c:1
2020/08/01 11:17:04 [debug] 17498#0: *10 http special response: 503, "/?"
2020/08/01 11:17:04 [debug] 17498#0: *10 http set discard body
2020/08/01 11:17:04 [debug] 17498#0: *10 HTTP/1.1 503 Service Temporarily Unavailable
...
For a rate limited request, processing stops before the server tries to generate the content or check the cache.
TL;DR; Rate limiting is applied first, before caching.