I'm currently facing a bottleneck with a web worker.

Here is what I'm trying to do:

  1. Main thread: sends a request to the web worker.
  2. Web worker: forwards this request to a web service. The web service answers with ~1 MiB of data.
  3. Web worker: post-processes the data from the web service, expanding the payload to ~4 MiB.
  4. Web worker: sends the 4 MiB payload to the main thread.
  5. Main thread: receives the message and deserializes it (the time-consuming part). Accessing MessageEvent.data is the bottleneck of the whole message reception handling.
  6. Main thread: puts the data into a Map.

The problem is that deserializing the result message on the main thread is quite time-consuming: it takes between 20 ms and 40 ms.
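
To put a rough number on steps 4 and 5 outside the worker setup, something like the sketch below could be used. It is only an approximation, assuming that `structuredClone` costs about as much as the serialize/deserialize pair performed by `postMessage`. `makePayload` and `measureCloneMs` are hypothetical helpers, and the record shape is invented for illustration.

```javascript
// Hypothetical stand-in for the post-processed payload:
// an array of small records (the real shape is not known here).
function makePayload(recordCount) {
  const records = [];
  for (let i = 0; i < recordCount; i++) {
    records.push({ id: i, label: `item-${i}`, value: i * 0.5 });
  }
  return records;
}

// Times one structured clone of the payload, in milliseconds.
// Assumption: this approximates serialization + deserialization together,
// whereas postMessage splits that cost between sender and receiver.
function measureCloneMs(payload) {
  const start = performance.now();
  const copy = structuredClone(payload);
  const elapsed = performance.now() - start;
  return { elapsed, copy };
}

const payload = makePayload(50000);
const { elapsed, copy } = measureCloneMs(payload);
console.log(`clone took ${elapsed.toFixed(1)} ms for ${copy.length} records`);
```

The absolute figure will differ from what the main thread sees in a browser, but it gives a baseline for comparing alternative payload shapes.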

Is there a way to share the initial object as is, to avoid this serialization/deserialization, which is not needed in my case? It really seems that the use of structuredClone is the real problem in my application, since reading the MessageEvent.data field is the bottleneck of the whole message reception handling.
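
One partial option is a transfer list: an `ArrayBuffer` listed there is moved to the receiver rather than copied. The following is a minimal sketch, assuming the payload can be packed into a typed array (which may not hold for an arbitrary object graph). `structuredClone` with a `transfer` option is used as a single-thread stand-in for `postMessage(message, [buffer])`, and the message shape is hypothetical.

```javascript
// Hypothetical payload: numeric results packed into a typed array.
const values = new Float64Array([1.5, 2.5, 3.5]);
const message = { id: 42, values };

// In a worker this would be: self.postMessage(message, [values.buffer]);
// structuredClone with a transfer list applies the same algorithm in one thread.
const received = structuredClone(message, { transfer: [values.buffer] });

// The buffer now belongs to the receiving side; the sender's view is detached.
console.log("sender view length:", values.length);
console.log("receiver view length:", received.values.length);
console.log("second value on receiver:", received.values[1]);
```

The rest of the message (here `id`) is still cloned, so this only helps when most of the bytes live in buffers.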

Web workers are really useful for processing large amounts of data and long REST requests, but the method of communication between threads is seriously putting me off.

EDIT: clarified what each thread does.

Raphallal
  • If the issue is that the main thread locks for 20-40 ms, MAYBE this would work to lessen that time: serialize the data to JSON in the worker, so that the main thread does not start 'implicit' deserialization when receiving the result as a string. Then let the main thread create a data URL and use `await (await fetch('dataurl....')).json()`. Since the JSON deserializer here is async, it might not block the main thread... Long shot though :) – Thomas Frank May 03 '23 at 09:32
  • I've tried using a JSON serializer/deserializer instead of the default `structuredClone`; however, it proved to be less efficient, taking about twice as long. Blocking is not the main problem, as 20 ms is a huge amount of time, blocking or not, especially for an object that already has the right structure. – Raphallal May 03 '23 at 09:36
  • Ok. I thought blocking would be the main problem from the user's perspective – Thomas Frank May 03 '23 at 09:36
  • I don't think you can improve this much. You can use a SharedArrayBuffer, but that is very low level and would probably take as much time to 'unpack/pack' your data structure too. If this is a process that runs several times and the data fetched only has minor changes between runs, you could use some json-diff library and only send the diff from the worker on the second run.... But apart from that idea ... and my previous one, I can't come up with anything. – Thomas Frank May 03 '23 at 09:44
  • Unfortunately, I've tried this one too ... And as you said, the 'unpack/pack' took even more time than the default message transferring system. – Raphallal May 03 '23 at 09:47
  • Just curious - does this have any impact on user experience, or is it just something you want to improve because it irritates you? :) – Thomas Frank May 03 '23 at 09:50
  • This MIGHT be interesting, a json serialization/deserialization library that is faster than native JSON methods (according to the author): https://pouyae.medium.com/sia-an-ultra-fast-serializer-in-pure-javascript-394a5c2166b8 But I guess it won't beat structuredClone according to your previous comparison. Still you could give it a go. – Thomas Frank May 03 '23 at 09:57
  • @ThomasFrank: well, it has an impact on user experience as we can receive multiple answers at once, hence have to do 100ms of processing and slow down some caching process. I've seen this but no breakthrough unfortunately. – Raphallal May 03 '23 at 10:02
  • ... Reading again your question I have the feeling I misread it at first. Could you please clarify where the data comes from, what's being done exactly by the Worker and what needs to be available in each thread please? – Kaiido May 04 '23 at 03:00
  • @Kaiido: I edited my message trying to be more understandable. – Raphallal May 04 '23 at 07:08
  • Thanks, could you clarify what "post processing" is in step 3, and how much time this takes compared to the cloning steps (both from the raw Response and from the main thread)? I.e., is the Worker really needed? And does the main thread really need the full data to be in the Map? Couldn't this Map live in the Worker instead? Your UI thread would only make requests to the Worker in order to get just the data it needs (e.g. if you're showing a table, the Worker would return just the data of the table's current page, or if you show a plot, the Worker would return just the x,y coords of the points, etc.). – Kaiido May 04 '23 at 07:36
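
The last suggestion, keeping the Map inside the Worker and answering narrow queries, could look roughly like this. It is a sketch under assumptions: the `store` Map, the `getPage` helper, and the message fields (`key`, `page`, `pageSize`) are all hypothetical names, and the rows are invented sample data.

```javascript
// Worker-side cache: the full data set never leaves the worker.
const store = new Map();

// Returns only the rows for one page of one cached data set.
function getPage(key, page, pageSize) {
  const rows = store.get(key) ?? [];
  const start = page * pageSize;
  return {
    total: rows.length,
    rows: rows.slice(start, start + pageSize),
  };
}

// Invented sample data standing in for the post-processed payload.
store.set("report", Array.from({ length: 10 }, (_, i) => ({ id: i })));

// Inside the worker, the handler would be wired up along these lines:
// self.onmessage = (event) => {
//   const { key, page, pageSize } = event.data;
//   self.postMessage(getPage(key, page, pageSize));
// };

const pageResult = getPage("report", 2, 3);
console.log(pageResult);
```

With this layout, the cost of the structured clone scales with the page size rather than with the whole data set.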
