How to set up Varnish caching for GraphQL in Django

Question

Here is my django-benchmark project in which I implemented simple REST API and GraphQL endpoints. In front of my application I put Varnish for caching. Caching works well for Rest HTTP endpoints and does not work for GraphQL here is my Varnish configuration. What am I doing wrong?

vcl 4.1;

# Default backend definition. Set this to point to your content server.
backend default {
    .host = "0.0.0.0";
    .port = "8080";
}

sub vcl_hash {
    # For multi site configurations to not cache each other's content
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    
    # Cache GET requests coming to endpoint located at /api/graphql
    if (req.method == "POST" && req.url ~ "/api/graphql") {
        call process_graphql_headers;
    }
}

# TODO: Find way to cache GraphQL requests.
sub process_graphql_headers {
}

sub vcl_recv {
    # # Bypass authenticated requests, these requests should not be cached by default
    # if (req.http.Authorization ~ "^Bearer") {
    #     return (pass);
    # }
}

sub vcl_backend_response {
}

sub vcl_deliver {
}

Application Setup

Set up the project and seed database.

$ git clone https://github.com/ldynia/django-benchmark.git
$ cd django-benchmark/
$ docker-compose up -d
$ docker exec -it django-benchmark python manage.py seed 100

Application Testing

Query graphql endpoint directly (port 8080), or passing by varnish first (port 8888).

# Query graphql endpoint directly
$ curl 'http://localhost:8080/api/graphql/' \
  -X 'POST' \
  -H 'Content-Type: application/json' \
  --data-raw '{"query":"query { allDummy { results { id } }}"}'

# Query graphql endpoint hitting varnish first
$ curl 'http://localhost:8888/api/graphql/' \
  -X 'POST' \
  -H 'Content-Type: application/json' \
  --data-raw '{"query":"query { allDummy { results { id } }}"}'

Response

{
    "data": {
        "allDummy": {
            "results": [{
                "id": "1"
            }, {
                "id": "2"
            }, {
                "id": "3"
            },
            ...
            ]
        }
    }
}

HTTP Request & Response Headers

$ curl 'http://localhost:8888/api/graphql/' -X 'POST' -H 'Content-Type: application/json' --data-raw '{"query":"query { allDummy { results { id } }}"}' -v > /dev/null
Note: Unnecessary use of -X or --request, POST is already inferred.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 127.0.0.1:8888...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8888 (#0)
> POST /api/graphql/ HTTP/1.1
> Host: localhost:8888
> User-Agent: curl/7.68.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 48
> 
} [48 bytes data]
* upload completely sent off: 48 out of 48 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Mon, 08 Mar 2021 13:45:42 GMT
< Server: WSGIServer/0.2 CPython/3.7.9
< Content-Type: application/json
< Vary: Cookie
< X-Frame-Options: DENY
< Content-Length: 17128
< X-Content-Type-Options: nosniff
< Referrer-Policy: same-origin
< Set-Cookie:  csrftoken=86bgA2o83BavIOTq7Wf59pXxZPeJ65byTMt286UbyKfPSo9O1uefGw8gMP99plbL; expires=Mon, 07 Mar 2022 13:45:42 GMT; Max-Age=31449600; Path=/; SameSite=Lax
< X-Varnish: 32791
< Age: 0
< Via: 1.1 varnish (Varnish/6.4)
< Accept-Ranges: bytes
< Connection: keep-alive
< 
{ [17128 bytes data]
100 17176  100 17128  100    48   124k    358 --:--:-- --:--:-- --:--:--  125k

https://www.apollographql.com/blog/graphql-caching-the-elephant-in-the-room-11a3df0c23ad/ — xadm, Mar 04 '21 at 21:35
Basically it's not a simple/simply resolvable problem ... https://stackoverflow.com/a/65513275/6124657 — xadm, Mar 04 '21 at 22:51
It all depends on what you are trying to achieve. I see custom request headers like `Store` and `Content-Currency` being used as cache variations. What is their purpose? I also see the cache being bypassed if `Bearer` authentication is used. Is your GraphQL service using this kind of authentication? What needs to be cached? `GET` requests only? Or `POST` requests as well? Is request & response body introspection required? I'll need a lot more information before I can assist you with the VCL code. — Thijs Feryn, Mar 05 '21 at 07:42
@ThijsFeryn I edited VCL configuration, and I got rid of custom headers -I don't need them after all. I'd like to cache GraphQL queries that comes over http GET method designated to `/api/graphql` endpoint. *Is request & response body introspection required?* I guess I'd like to introspect only request I don't care about body at the moment. — Lukasz Dynowski, Mar 05 '21 at 10:30
@LukaszDynowski Please add a full GraphQL HTTP request & corresponding HTTP response to your question. Please also mention the request body criteria you want to filter on, and I'll suggest a VCL template for you. — Thijs Feryn, Mar 05 '21 at 14:01
@ThijsFeryn I've updated my question about the things that you requested. Moreover, I included instructions on how to setup the project and test application endpoints. Is that sufficient ? — Lukasz Dynowski, Mar 08 '21 at 13:52

How to set up Varnish caching for GraphQL in Django

Application Setup

Application Testing

Response

HTTP Request & Response Headers

0 Answers0