4

We recently switched to ExpediaGroup's GraphQL library (graphql-kotlin), which is based on Spring WebFlux.

Since switching, our Jaeger traces show gaps before and after the database query spans are created. No computation-heavy work is done before or after the aforementioned database queries besides trivial entity -> DTO mapping. An initial investigation with VisualVM showed no obvious hotspots, but we lose 3-6 ms in total in a local environment (dev and prod suffer even more), which doubles the overall response time.

We are at a loss where to go from here: Is it a WebFlux issue? Is it an issue with the linked library? We execute all of our logic in a separate thread pool which is not saturated (these results come from a single client calling our GraphQL endpoint non-concurrently), so we shouldn't be blocking Netty's event loop (and even if we were, that shouldn't create those "gaps" if my understanding of WebFlux is correct).

I am looking for a way to investigate this issue further, or for any configuration knobs that might help.

The "gaps" in between the database calls have been identified as framework-related and can be circumvented by restructuring our code. The head and tail "gaps", however, do not show up in profilers and cannot be worked around. Furthermore, we are not losing any tracing information across thread boundaries; that has been accounted for.

Additional information:

  • Our average response size is below 1 KB
  • There is no reverse proxy in front of this service in the local environment, which also exhibits the issue
  • All traffic is HTTP, not HTTPS
roookeee
  • I have also posted on the library's GitHub, including a minimal reproducible example: https://github.com/ExpediaGroup/graphql-kotlin/discussions/1409#discussioncomment-2557486 – roookeee Apr 15 '22 at 17:50
  • Have you tried using a PreparsedDocumentProvider to avoid the overhead of parsing queries? See the official documentation for more details: https://docs.spring.io/spring-graphql/docs/current-SNAPSHOT/reference/html/#execution-graphqlsource-operation-caching – Korashen May 11 '22 at 19:29
  • I implemented a PreparsedDocumentProvider and made sure it got picked up by ExpediaGroup's framework. While I feel like the performance improved overall, the gaps still remain (looking at a trace right now: 3 ms in the front, 2 ms in the back, while the actual work takes ~5 ms combined, so a 100% overhead). Thank you for the idea anyway – roookeee May 12 '22 at 06:55

2 Answers

0

This can have several different causes, such as:

  • slow processing speed of the machine
  • a spring-webflux glitch
  • long / unwieldy code
  • an issue specific to the local environment

Try running the code on a different computer or in a public environment.

0

After a lot of debugging we have found the following answers:

  • The "gaps" at the start and end of each request are pure I/O: reading the incoming request and writing the corresponding response. This also explains the bigger gaps in production, where the client is slow to send the request or consume the response, which is not the case for localhost requests
  • The larger gaps in between can be attributed to missing warm-up (JIT), which seems to be necessary for the WebFlux stack
  • The overall "bad" performance is also warm-up related (JIT)
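
One way to check the warm-up explanation is to replay a query a number of times before taking any measurements. The following is only a minimal sketch using the JDK's built-in HTTP client; the `warmUp` helper, its parameters, and the endpoint URL in the usage note are hypothetical and not part of graphql-kotlin or our actual setup.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Hypothetical helper: sends the same GraphQL request body `iterations` times
// so the JIT gets a chance to compile the hot request path before real traffic
// is traced. Returns the number of responses that came back with HTTP 200.
fun warmUp(endpoint: URI, queryJson: String, iterations: Int): Int {
    val client = HttpClient.newHttpClient()
    val request = HttpRequest.newBuilder(endpoint)
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(queryJson))
        .build()
    var successful = 0
    repeat(iterations) {
        // Blocking send is fine here: this runs once, outside the request path.
        val response = client.send(request, HttpResponse.BodyHandlers.ofString())
        if (response.statusCode() == 200) successful++
    }
    return successful
}
```

For example, `warmUp(URI.create("http://localhost:8080/graphql"), """{"query":"{ __typename }"}""", 1_000)` could be run once after startup (URL and iteration count are assumptions); traces taken afterwards should then reflect the warmed-up behaviour rather than cold-start costs.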

Furthermore the suggested PreparsedDocumentProvider GraphQL caching (thank you Korashen) seems to be giving a decent performance uplift:

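    import graphql.ExecutionInput
    import graphql.execution.preparsed.PreparsedDocumentEntry
    import graphql.execution.preparsed.PreparsedDocumentProvider
    import java.util.function.Function
    import org.cache2k.Cache
    import org.cache2k.Cache2kBuilder
    import org.springframework.context.annotation.Bean

    // Declare this inside a Spring @Configuration class.
    @Bean
    fun preparsedDocumentProvider(): PreparsedDocumentProvider = object : PreparsedDocumentProvider {
        @Suppress("MagicNumber")
        private val cache: Cache<String, PreparsedDocumentEntry> = Cache2kBuilder
            .of(String::class.java, PreparsedDocumentEntry::class.java)
            // bound the cache so many distinct queries cannot exhaust memory (cache attacks)
            .entryCapacity(1024)
            // entries never expire; they are only evicted when the capacity is reached
            .eternal(true)
            .build()

        // Parse and validate each distinct query string once, then serve it from the cache.
        override fun getDocument(
            executionInput: ExecutionInput,
            parseAndValidateFunction: Function<ExecutionInput, PreparsedDocumentEntry>
        ) = cache.computeIfAbsent(executionInput.query) { parseAndValidateFunction.apply(executionInput) }
    }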
roookeee