
I have a web server (Java + Jooby + Undertow) that needs to load a big data model (about 200MB) from AWS S3 periodically. I've also done what I could to avoid GC issues: each time, the big binary data is loaded into a pre-allocated ByteBuffer, and I use FlatBuffers, a zero-copy serialization schema, for the model.
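For reference, this is roughly what the loading path looks like (a simplified sketch, assuming the AWS SDK v2 S3Client; the bucket/key, the 256MB buffer capacity, and the FlatBuffers-generated ModelData class are placeholders):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;

// Sketch of the periodic model reload: stream the S3 object into a
// pre-allocated direct ByteBuffer, then access it via FlatBuffers.
public class ModelLoader {
    private final S3Client s3 = S3Client.create();
    // Allocated once at startup and reused for every reload (~200MB model).
    private final ByteBuffer modelBuffer = ByteBuffer.allocateDirect(256 * 1024 * 1024);

    public ByteBuffer reload(String bucket, String key) throws IOException {
        GetObjectRequest request = GetObjectRequest.builder()
                .bucket(bucket)
                .key(key)
                .build();
        modelBuffer.clear();
        byte[] chunk = new byte[64 * 1024];
        try (InputStream in = s3.getObject(request)) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                modelBuffer.put(chunk, 0, n);
            }
        }
        modelBuffer.flip();
        // Zero-copy access through the FlatBuffers-generated root accessor, e.g.:
        // ModelData model = ModelData.getRootAsModelData(modelBuffer);
        return modelBuffer;
    }
}
```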

But I find that every time the big model is loaded, there is a request latency spike. I even tried disabling the model deserialization, but the latency spike still occurs.

My question is: how can I load the big model in the web server without any impact on performance (latency)?

Russell Bie
  • Which Java version and GC do you use? In a server environment, if you have enough RAM, I would recommend using G1GC (the default GC in Java 9+). – JMax Nov 05 '19 at 15:08
  • @JMax These are my JVM options: -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xms8192m -Xmx8192m -XX:+UseG1GC -XX:+AggressiveOpts -XX:+UseLargePages -server, and I think 8G of RAM is enough for my application. – Russell Bie Nov 06 '19 at 03:25
  • Export it on the source system in a file format, then transfer it with some file transfer mechanism, and finally load it on the target system. – Lee Nov 06 '19 at 07:56
  • 1
    Doing things takes time, that’s the way it is. You can’t deserialize 200MB of data without raising the latency. But why does your web server need to load this data periodically? Why cant’t it keep the data in memory? – Holger Nov 06 '19 at 09:02
  • @Holger Because my model is trained continuously, the model data needs to be uploaded to the online server periodically. – Russell Bie Nov 16 '19 at 14:23

1 Answer


I found that downloading the big S3 file (either to local disk or into memory) was what caused the service latency spikes, and that restricting the download thread count mitigated the issue. Finally, I moved the download logic into a separate process, which seems to resolve it, although we can still see a small spike when CPU consumption is high.
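To illustrate the throttling idea (not the exact code we ended up with), the download loop can be capped at a fixed byte rate so the transfer does not compete with the request-handling threads; the 20 MB/s cap and the chunk size below are placeholder values:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Sketch: copy the S3 stream while capping throughput so the download
// does not saturate CPU and network bandwidth alongside request handling.
public final class ThrottledCopy {
    private static final int CHUNK_SIZE = 64 * 1024;               // 64 KB per read
    private static final long BYTES_PER_SECOND = 20L * 1024 * 1024; // ~20 MB/s cap

    public static void copy(InputStream in, OutputStream out)
            throws IOException, InterruptedException {
        byte[] chunk = new byte[CHUNK_SIZE];
        long start = System.nanoTime();
        long bytesCopied = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            bytesCopied += n;
            // If we are ahead of the target rate, sleep until back on schedule.
            long elapsedNanos = System.nanoTime() - start;
            long expectedNanos = bytesCopied * 1_000_000_000L / BYTES_PER_SECOND;
            if (expectedNanos > elapsedNanos) {
                Thread.sleep((expectedNanos - elapsedNanos) / 1_000_000L);
            }
        }
    }
}
```

Moving the download into a separate process goes one step further: the copy above then runs under its own CPU scheduling and heap, and the web server only maps or reads the finished file.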

Russell Bie