
I'm trying to use Web Workers to process large volumes of data, and when passing data back to the main thread for display, I would like to use a transferable object to reduce the impact on the UI thread.

The processing currently produces a multidimensional array that can also contain objects. For instance:

[{foo: [{bar: "Alice",
         car: 23,
         dab: [2, 3, 5]}],
  faa: [{moo: {a: [2,3], b: [4,5]} },
        {moo: {a: [6,7], b: [8,9]} }]},
 {foo: [{bar: "John",
         car: 33,
         dab: [6, 7, 1]}],
  faa: [{moo: {a: [5,5], b: [9,2]} },
        {moo: {a: [7,7], b: [4,2]} }]},
 ...]

I have seen the string conversion post "Converting between strings and ArrayBuffers", but I can't see how to apply it directly to my array structure.

Appreciate the help!

Kaiesh
  • Can you just use `JSON.stringify()` ? – Jack Allan Apr 23 '14 at 07:44
  • But then there is no benefit to the transferable object, as I will need to deserialise it on the main thread. The standard message passing functionality does this automatically, if I am not mistaken... The benefit of transferable objects is that the main thread does not need to do any work to use the object that has been created. – Kaiesh Apr 23 '14 at 11:44
  • Did you ever find a solution to this problem? – Nick Jennings Mar 26 '15 at 14:51
  • Unfortunately not - everything seems to be focused on converting from strings - which means that deserialization still happens on the main thread. My workers currently just pass the JS obj in the post-message and so this is taken care of by the browser itself, but I have not identified how to copy large working objects over. – Kaiesh Mar 26 '15 at 20:50
  • @Kaiesh I am working on the same problem. Have you found any solution? Actually, I am doing matrix multiplication. I want each web worker's result to be stored in the main... – Baran Jun 04 '15 at 18:30
  • @Baran I have not! I imagine that in your scenario the benefits of matrix multiplication being pushed to the background would far outweigh the overhead of deserialisation on the main thread though. As soon as I find a working solution I will be sure to post it here! Keep it starred until then I would say. – Kaiesh Jun 06 '15 at 06:46

1 Answer


Lots of people have trouble understanding this, so let me give you a picture of your options and what they do:

(a) Using plain postMessage with your data

var object = { ... };
worker.postMessage(object);
  1. [Main thread] Creates a structured clone object
  2. [Main thread] Recursively copies data from the object into the structured clone
  3. [Main thread] Posts the structured clone to the [Worker]
  4. [Worker] Creates a new object from the structured clone
  5. [Worker] Dispatches a new message with the object as its parameter

Note that creating and parsing the structured clone is done by optimized native code.
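The copy that option (a) makes can be observed without a worker. This is a minimal sketch, assuming a runtime that exposes the global structuredClone() function (recent browsers, Node.js 17 or later), which applies the same structured clone algorithm that postMessage uses. The records value is a trimmed version of the sample data from the question.

```javascript
// Trimmed sample shaped like the data in the question.
const records = [
  { foo: [{ bar: "Alice", car: 23, dab: [2, 3, 5] }],
    faa: [{ moo: { a: [2, 3], b: [4, 5] } }] },
];

// Assumption: structuredClone() is available as a global.
// It runs the structured clone algorithm natively, as postMessage does.
const copy = structuredClone(records);

// The result is a deep copy: new objects at every level, equal contents.
const isNewTopLevel = copy !== records;
const isNewNested = copy[0].foo[0].dab !== records[0].foo[0].dab;
const sameContents = JSON.stringify(copy) === JSON.stringify(records);

console.log(isNewTopLevel, isNewNested, sameContents);
```

No user-written conversion code runs here, which is the point of option (a).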

(b) Converting data to transferable

var object = { ... };
var binary = CreateTypedArrayFromObject(object);
worker.postMessage(binary.buffer, [binary.buffer]);
  1. [Main thread] Runs slow JavaScript code to convert the object to a TypedArray
  2. [Main thread] This involves either calculating the object's size first, or creating many typed arrays and concatenating them
  3. [Main thread] Moves the ArrayBuffer of the TypedArray to the [Worker]
  4. [Worker] Receives the ArrayBuffer
  5. [Worker] Dispatches a new message with the received ArrayBuffer as its parameter
  6. [Worker] Runs JavaScript code to create a new object, discarding the received ArrayBuffer
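To make the extra copies in option (b) concrete, here is one possible sketch of the conversion. The helper names createTypedArrayFromObject and createObjectFromBytes are hypothetical, and going through JSON with TextEncoder/TextDecoder is only one assumed way to do it, not a recommended one.

```javascript
// Hypothetical stand-in for the abstract CreateTypedArrayFromObject above.
function createTypedArrayFromObject(object) {
  const json = JSON.stringify(object);       // copy 1: object graph -> string
  return new TextEncoder().encode(json);     // copy 2: string -> Uint8Array
}

// Hypothetical inverse that the receiving thread would have to run.
function createObjectFromBytes(bytes) {
  const json = new TextDecoder().decode(bytes); // copy 3: bytes -> string
  return JSON.parse(json);                      // copy 4: string -> object graph
}

const object = { bar: "John", car: 33, dab: [6, 7, 1] };
const binary = createTypedArrayFromObject(object);

// In a page, the transfer itself would be:
//   worker.postMessage(binary.buffer, [binary.buffer]);
// and the receiver would wrap event.data in a Uint8Array before decoding.
const restored = createObjectFromBytes(binary);

console.log(restored);
```

Only the transfer step is free; every other step is a copy performed in JavaScript on one thread or the other.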

What I'm pointing out is that you wanted to avoid a copy, but you're still making one, only this time it's a JavaScript copy rather than a native one. If you want to optimize, you have to design your data structure so that it operates on typed arrays. If it doesn't, don't even try to use them; you will just add extra overhead to your code.
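As an illustration of a structure that operates on typed arrays, the sketch below stores the dab triples from the question back to back in a single Int32Array and reads rows straight out of it. The layout, ROW_WIDTH and dabRow are assumptions made for this example. The transfer is simulated with structuredClone and its transfer option (assumed available: recent browsers, Node.js 17 or later) so it can run without a worker.

```javascript
// Hypothetical flat layout: each record contributes ROW_WIDTH integers.
const ROW_WIDTH = 3;
const dab = new Int32Array([2, 3, 5, 6, 7, 1]); // rows for "Alice" and "John"

// Read row i as a view on the flat storage; no objects are rebuilt.
function dabRow(values, i) {
  return values.subarray(i * ROW_WIDTH, (i + 1) * ROW_WIDTH);
}

// In a worker this would be: postMessage(dab.buffer, [dab.buffer]);
// structuredClone with `transfer` performs the same move synchronously.
const movedBuffer = structuredClone(dab.buffer, { transfer: [dab.buffer] });
const received = new Int32Array(movedBuffer);

console.log(dab.byteLength);                  // 0, the sender's buffer is detached
console.log(Array.from(dabRow(received, 1))); // [ 6, 7, 1 ]
```

The receiver works on the moved bytes directly, so no conversion step runs on either side. String fields such as bar would need a separate encoding, which this sketch does not cover.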

Tomáš Zato
  • Tomas - my intent is to perform the conversion on the worker thread so that there is no impact to the main thread. My worker threads currently do all the XHR work, and manipulation in preparation for presentation, and the last bit of optimisation would be to prevent JSON parsing or object cloning of large volumes of data. You reference a method `CreateTypedArrayFromObject` - do you have an implementation of this anywhere? – Kaiesh Nov 12 '15 at 19:06
  • Yeah, I actually managed to, sort of, serialize the whole Window object as a binary stream. But you still don't understand what's going on here, even after the step by step explanation. If you do any conversion between an ArrayBuffer and a JavaScript object, you're just slowing the code down, because the native browser structured clone algorithm is gonna beat you. You mentioned JSON, but structured clone is not JSON. It converts the JS object to binary and moves the clone to the worker. You're now trying to re-implement it because you don't even know it exists. – Tomáš Zato Nov 12 '15 at 22:48
  • PS.: Here's an answer of mine clearly demonstrating how insignificant the difference between transfer and copy is: http://stackoverflow.com/a/33309300/607407 And this is still without the JS clone algorithm overhead. – Tomáš Zato Nov 12 '15 at 22:52
  • I also would like to know what CreateTypedArrayFromObject does. :) – arpo Nov 11 '16 at 09:52
  • @arpo It's an abstract way to express "*some function that converts a JS object to a byte array*". You could, for example, convert the object to JSON and convert that to bytes. I wrote a more complex set of functions to do that, but they're kinda slow, hard to understand and produce a bigger binary. You can see an example here: http://fel.8u.cz/jsbin.html It can serialize circular references and some functions. – Tomáš Zato Nov 11 '16 at 11:05
  • @tomáš-zato thanks, I just posted this http://stackoverflow.com/questions/40545604/posting-objects-to-web-worker-using-javascript So should it work to turn the object array in my question into a blob and send it to the worker? – arpo Nov 11 '16 at 11:12
  • @arpo Please read my answer here more carefully. What you are trying to achieve is futile. Just use postMessage without transferable objects. The only reason to use a transferable is when your data already IS a binary buffer. My answer and comments here explain it. And this is not my only answer on the topic. I strongly discourage you from using my experimental library in production code. Post your array as a normal message. – Tomáš Zato Nov 11 '16 at 11:32
  • @tomáš-zato thanks for the insights - but you could actually write without all the disdain. The best recommendation you have given here (once all the angst is removed) is that if transferring data by transferable object is required, then the best thing to do is to use typed arrays within the main thread - as native copy will always be faster. Understand why the question is important for some - as we're operating on very large volumes of data that need to be manipulated, and the copy process in any form is not fast enough. – Kaiesh Dec 08 '16 at 03:22
  • @Kaiesh I see no disdain. I have written the answer from the perspective of someone who has to explain the same trivial thing over and over. So this time, I ensured that I'm making a strong point, both factually and emotionally. – Tomáš Zato Dec 10 '16 at 13:10
  • @Kaiesh Have a look at https://www.npmjs.com/package/typeson – Sarath Nov 17 '20 at 20:43