I have a web server (Java + Jooby + Undertow) that needs to periodically load a large data model (about 200 MB) from AWS S3. I have already done what I can to avoid GC issues: each time, the binary data is loaded into a pre-allocated ByteBuffer, and the model is serialized with FlatBuffers, a zero-copy format.
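A minimal sketch of the kind of loading described above, assuming the S3 object is exposed as a plain `InputStream`. `ModelLoader`, `load`, and `view` are hypothetical names used for illustration only, not the actual server code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical helper: reuses one direct buffer for every reload.
final class ModelLoader {
    // Allocated once, off-heap, so a reload does not create a new 200 MB object.
    private final ByteBuffer buffer;

    ModelLoader(int capacityBytes) {
        this.buffer = ByteBuffer.allocateDirect(capacityBytes);
    }

    /**
     * Reads the stream into the pre-allocated buffer and returns the number
     * of bytes loaded. Reading stops at end of stream or when the buffer is
     * full, whichever comes first.
     */
    int load(InputStream in) throws IOException {
        buffer.clear();
        ReadableByteChannel channel = Channels.newChannel(in);
        while (buffer.hasRemaining() && channel.read(buffer) != -1) {
            // keep reading until end of stream or the buffer is full
        }
        buffer.flip();
        return buffer.remaining();
    }

    /** Read-only view over the loaded bytes, e.g. to hand to a FlatBuffers accessor. */
    ByteBuffer view() {
        return buffer.asReadOnlyBuffer();
    }
}
```

Note that this sketch overwrites the buffer in place, so a real server would also need to keep request threads from reading it while a reload is in progress.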
However, every time the model is loaded, request latency spikes. I even tried disabling the model deserialization entirely, but the latency spike is still there.
My question is: how can I load a large model in a web server without affecting request latency?