
I am working with web workers and passing a large amount of data to a web worker, which takes a lot of time. I want to know an efficient way to send the data.

I have tried the following code:

var worker = new Worker('js2.js');
worker.postMessage(buffer, [buffer]);
worker.postMessage(obj, [obj.mat2]);
if (buffer.byteLength) {
  alert('Transferables are not supported in your browser!');
}
Artyom Neustroev
vicky
  • While it's for Web Worker => Browser, [in this RedHat article](http://developerblog.redhat.com/2014/05/20/communicating-large-objects-with-web-workers-in-javascript/) there's a nice explanation. Basically, you need to break your buffer into several blocks and pass each one around. Another option (also in that link) is using FileReader. I opened a bounty on your question, though, since I'm also interested. – Francisco Presencia May 18 '15 at 05:50
  • Maybe you should only transfer that buffer instead of also serialising it? – Bergi May 18 '15 at 06:25
  • Transferables are to be passed as the third argument, not the 2nd... – dandavis May 20 '15 at 19:38
  • @dandavis aren't both of these perfectly valid: `worker.postMessage(arrayBuffer, [arrayBuffer]);` and `window.postMessage(arrayBuffer, targetOrigin, [arrayBuffer]);`? (source: http://www.html5rocks.com/en/tutorials/workers/basics/#toc-transferrables) – John May 21 '15 at 07:05
  • Here's an honest short answer: a) You can't really do that yet; you can't _share_ data between workers easily (but you can transfer it if that's enough) and b) https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/d-0ibJwCS24 – Benjamin Gruenbaum May 21 '15 at 10:16
  • Here is how it's going to be solved: https://docs.google.com/document/d/1NDGA_gZJ7M7w1Bh8S0AoDyEqwDdRh4uSoTPSNn77PFk/edit - we're not there yet, but hold tight. – Benjamin Gruenbaum May 21 '15 at 10:17
  • @John: yeah, i just realized workers use different arity... – dandavis May 21 '15 at 19:13
  • Does this answer your question? [Using transferable objects from a Web Worker](https://stackoverflow.com/questions/16071211/using-transferable-objects-from-a-web-worker) – Klesun Feb 16 '20 at 18:26

3 Answers


UPDATE

Modern versions of Chrome, Edge, and Firefox now support SharedArrayBuffer (though not Safari at the time of this writing; see SharedArrayBuffer on MDN), so that is another possibility for fast data transfer, with a different set of trade-offs compared to a transferable (see MDN for all the trade-offs and requirements of SharedArrayBuffer).

UPDATE:

According to Mozilla, SharedArrayBuffer has been disabled in all major browsers, so the option described in the following EDIT no longer applies.

Note that SharedArrayBuffer was disabled by default in all major browsers on 5 January, 2018 in response to Spectre.

EDIT: There is now another option: sending a SharedArrayBuffer. This is part of ES2017 under shared memory and atomics and is now supported in Firefox 54 Nightly. If you want to read about it, you can look here. I will probably write something up at some point and add it to my answer, and I will try to add it to the performance benchmark as well.

To answer the original question:

I am working on web workers and I am passing large amount of data to web worker, which takes a lot of time. I want to know the efficient way to send the data.

An alternative to @MichaelDibbets' answer, which sends a copy of the object to the web worker, is to use a transferable object, which is zero-copy.

Your code shows that you intended to make your data transferable, but I'm guessing it didn't work out. So I will explain what it means for data to be transferable, for you and for future readers.

Transferring objects "by reference" (although that isn't the perfect term for it, as explained in the next quote) doesn't work on just any JavaScript object. It has to be a transferable data type.

[With Web Workers] Most browsers implement the structured cloning algorithm, which allows you to pass more complex types in/out of Workers such as File, Blob, ArrayBuffer, and JSON objects. However, when passing these types of data using postMessage(), a copy is still made. Therefore, if you're passing a large 50MB file (for example), there's a noticeable overhead in getting that file between the worker and the main thread.

Structured cloning is great, but a copy can take hundreds of milliseconds. To combat the perf hit, you can use Transferable Objects.

With Transferable Objects, data is transferred from one context to another. It is zero-copy, which vastly improves the performance of sending data to a Worker. Think of it as pass-by-reference if you're from the C/C++ world. However, unlike pass-by-reference, the 'version' from the calling context is no longer available once transferred to the new context. For example, when transferring an ArrayBuffer from your main app to Worker, the original ArrayBuffer is cleared and no longer usable. Its contents are (quite literally) transferred to the Worker context.

- Eric Bidelman, Developer at Google (source: html5rocks)

The only problem is that, at the time of writing, only two types are transferable: ArrayBuffer and MessagePort (Canvas proxies are hopefully coming later). An ArrayBuffer cannot be manipulated directly through its own API; instead, create a typed array or a DataView over it to get a particular view into the buffer that you can read from and write to.
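To make the buffer/view distinction concrete, here is a minimal sketch that runs in any modern JavaScript runtime. The variable names are only illustrative; the point is that the views carry the read/write API while the ArrayBuffer itself is just the underlying bytes.

```javascript
// An ArrayBuffer is raw bytes; it has no element accessors of its own.
const buffer = new ArrayBuffer(8);

// A typed array interprets those bytes as fixed-size numbers.
const bytes = new Uint8Array(buffer);
bytes[0] = 42;

// A DataView reads and writes individual values at explicit byte offsets.
const dataView = new DataView(buffer);
dataView.setUint8(1, 7);

// All views share the same underlying buffer, so writes are visible to each.
const ints = new Int32Array(buffer);
console.log(dataView.getUint8(0));   // 42
console.log(bytes[1]);               // 7
console.log(ints.length);            // 2 (8 bytes / 4 bytes per element)
console.log(ints.buffer === buffer); // true
```

Every view exposes the buffer it sits on through its `buffer` property, which is what belongs in a transfer list.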

From the html5rocks link

To use transferrable objects, use a slightly different signature of postMessage():

worker.postMessage(arrayBuffer, [arrayBuffer]);

window.postMessage(arrayBuffer, targetOrigin, [arrayBuffer]);

In the worker case, the first argument is the data and the second is the list of items that should be transferred. The first argument doesn't have to be an ArrayBuffer, by the way. For example, it can be a JSON object:

worker.postMessage({data: int8View, moreData: anotherBuffer}, [int8View.buffer, anotherBuffer]);

So, according to that, your

var worker = new Worker('js2.js');
worker.postMessage(buffer, [ buffer]);
worker.postMessage(obj, [obj.mat2]);

should be fast, and the data should be transferred zero-copy. The only problem would be if your buffer or obj.mat2 is not an ArrayBuffer or another transferable. You may be passing a typed array view where you should be passing its underlying buffer.
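One way to guard against that mix-up is a small helper that always returns the underlying ArrayBuffer. This is only a sketch: `underlyingBuffer` is a hypothetical name, not part of any API, and `mat2` stands in for whatever `obj.mat2` actually holds.

```javascript
// Hypothetical helper (not part of any API): returns the ArrayBuffer to put
// in the transfer list, whether it is given a buffer or a view onto one.
function underlyingBuffer(value) {
  if (value instanceof ArrayBuffer) return value;
  if (ArrayBuffer.isView(value)) return value.buffer; // typed array or DataView
  throw new TypeError('Expected an ArrayBuffer, a typed array, or a DataView');
}

const mat2 = new Float32Array([1, 2, 3, 4]); // a view, not an ArrayBuffer
const transferList = [underlyingBuffer(mat2)];

console.log(mat2 instanceof ArrayBuffer);            // false
console.log(transferList[0] instanceof ArrayBuffer); // true

// In a browser this could then be used as, for example:
// worker.postMessage({ mat2: mat2 }, transferList);
```

Note that transferring a view's buffer moves the whole buffer, including any bytes that lie outside the view.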

Suppose you have this ArrayBuffer and its Int32Array representation. (Although the variable is named view, it is not a DataView; DataViews do, however, have a buffer property just as typed arrays do. At the time this was written, MDN used the name 'view' for the result of calling a typed array constructor, so I assumed it was a good name for it.)

var buffer = new ArrayBuffer(90000000);
var view = new Int32Array(buffer);
for(var c=0;c<view.length;c++) {
    view[c]=42;
}

This is what you should not do (send the view):

worker.postMessage(view);

This is what you should do (send the ArrayBuffer):

worker.postMessage(buffer, [buffer]);

These are the results after running this test on plnkr.

Average for sending views is 144.12690000608563
Average for sending ArrayBuffers is 0.3522000042721629
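A comparison along these lines can be sketched without a worker at all. The snippet below is not the original plnkr test. It assumes a runtime that provides the global `structuredClone` with its `transfer` option (modern browsers, recent Node.js) and uses it as a synchronous stand-in for `worker.postMessage`. The 16 MiB size and the `timeIt` helper are arbitrary choices for illustration, and the timings will vary by machine.

```javascript
// Time a structured clone (copy) against a transfer of the same amount of data.
const SIZE = 16 * 1024 * 1024; // 16 MiB, an arbitrary size for this sketch

// Hypothetical helper: runs fn once and reports the elapsed milliseconds.
function timeIt(fn) {
  const start = performance.now();
  const result = fn();
  return { ms: performance.now() - start, result: result };
}

// Case 1: send the view with no transfer list, so the data is copied.
const copied = new Int32Array(new ArrayBuffer(SIZE)).fill(42);
const copyRun = timeIt(() => structuredClone(copied));

// Case 2: send the buffer and list it for transfer, so nothing is copied.
const moved = new Int32Array(new ArrayBuffer(SIZE)).fill(42);
const moveRun = timeIt(() =>
  structuredClone(moved.buffer, { transfer: [moved.buffer] })
);

console.log('copy:     ' + copyRun.ms.toFixed(3) + ' ms');
console.log('transfer: ' + moveRun.ms.toFixed(3) + ' ms');
console.log(copied.byteLength); // 16777216: the source is still intact
console.log(moved.byteLength);  // 0: the source buffer was detached
```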

EDIT: As stated by @Bergi in the comments, you don't need the buffer variable at all if you have the view, because you can just send view.buffer, like so:

worker.postMessage(view.buffer, [view.buffer]);

As a side note to future readers: if you send an ArrayBuffer without the last argument listing which ArrayBuffers to transfer, the ArrayBuffer is copied rather than transferred.

In other words, when sending transferables you want this:

worker.postMessage(buffer, [buffer]);

Not this:

worker.postMessage(buffer);
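The difference is observable on the sending side. The sketch below assumes a runtime with a global `MessageChannel` (browsers, recent Node.js) and uses a message port as a stand-in for a worker, since `port.postMessage` accepts the same message and transfer-list arguments as `worker.postMessage`.

```javascript
const { port1, port2 } = new MessageChannel();

// No transfer list: the buffer is structured-cloned (copied).
const cloned = new ArrayBuffer(1024);
port1.postMessage(cloned);
console.log(cloned.byteLength); // 1024, the sender keeps its buffer

// Buffer listed for transfer: ownership moves to the receiver.
const transferred = new ArrayBuffer(1024);
port1.postMessage(transferred, [transferred]);
console.log(transferred.byteLength); // 0, the sender's buffer is detached

// Close the ports so nothing keeps the channel alive.
port1.close();
port2.close();
```

A `byteLength` of 0 immediately after posting is also a quick way to confirm that a transfer actually took place.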

EDIT: One last note: since you are sending a buffer, don't forget to turn it back into a view once the web worker receives it. Once it's a view, you can manipulate it (read from and write to it) again.
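On the worker side, that could look roughly like the following sketch. `handleMessage` is a hypothetical name; in a real worker the function would be assigned to `self.onmessage` and would call `self.postMessage` itself. Here it only returns what it would post, and it is fed a simulated event so the snippet runs outside a worker.

```javascript
// Hypothetical worker-side handler: wraps the received ArrayBuffer in a view,
// works on it, and prepares the buffer to be transferred back.
function handleMessage(event) {
  const view = new Int32Array(event.data); // the buffer becomes usable again
  for (let i = 0; i < view.length; i++) {
    view[i] += 1; // read and write through the view
  }
  // The message and transfer list a worker could pass to postMessage.
  return { message: view.buffer, transfer: [view.buffer] };
}

// Simulated message event, standing in for what the main thread would send.
const incoming = new Int32Array([41, 41, 41]).buffer;
const reply = handleMessage({ data: incoming });
console.log(Array.from(new Int32Array(reply.message))); // [42, 42, 42]
```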

And for the bounty:

I am also interested in official size limits for firefox/chrome (not only time limit). However answer the original question qualifies for the bounty (;

As for a browser's limit on sending something of a certain size, I am not completely sure. In the html5rocks entry quoted above, Eric Bidelman mentions a 50 MB file taking hundreds of milliseconds to pass without a transferable data type, while my test above shows a transfer taking only around a millisecond with a transferable data type. And 50 MB is honestly pretty large.

This is purely my own opinion, but I don't believe there is a limit on the size of what you can send, transferable or not, other than the limits of the data type itself. Of course, your biggest worry would probably be the browser stopping a long-running script if it has to copy the whole thing rather than transfer it zero-copy.

Hope this post helps. Honestly, I knew nothing about transferables before this, but it was fun figuring them out through some tests and through that blog post by Eric Bidelman.

John
  • Alternatively send `view.buffer`, you don't need to have an extra variable for that :-) – Bergi May 21 '15 at 20:59
  • Thanks @Bergi I've updated my answer to show that as a better alternative. – John May 21 '15 at 21:47
  • Buffers can also now be shared (see my comments on the question) but only in Chrome. – Benjamin Gruenbaum May 21 '15 at 21:49
  • @BenjaminGruenbaum Nice. I watched a talk by Brendan Eich about the future of JavaScript (https://www.youtube.com/watch?v=6AytbSdWBKg), and I swear it said something about plans to add sharing between web workers. That would make it a lot simpler once they add that. I'll try researching it a little bit later and update my answer. – John May 21 '15 at 21:53
  • I'm glad that you also learned about it besides giving a great answer. If it's okay, I will edit it later because there are parts that were unclear for me in this answer that were more clear in @MichaelDibbets answer (although this answer is much more comprehensible and that's why you get the bounty). – Francisco Presencia May 25 '15 at 06:29
  • Thanks for the clear explanation. You cleared up a part for me that never got clear out of all the online manuals, that the buffer plays a role and what role it plays. – Tschallacka Jul 03 '17 at 07:53
  • Unfortunately SharedArrayBuffer was _disabled by default in all major browsers on 5 January, 2018 in response to Spectre._ You might want to amend that edit. – bvanlew Feb 26 '18 at 08:15

I had issues with webworkers too, until I just passed a single argument to the webworker.

So instead of

worker.postMessage( buffer,[ buffer]);
worker.postMessage(obj,[obj.mat2]);

Try

var myobj = {buffer:buffer,obj:obj};
worker.postMessage(myobj);

This way, in my experience, it seems to get passed by reference, and it's insanely fast. I post over 20,000 data elements back and forth in a single push every 5 seconds without noticing the data transfer. I've been working exclusively with Chrome, though, so I don't know how it will hold up in other browsers.
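Whether an object posted without a transfer list is shared or copied can be checked directly. The sketch below assumes a runtime with the global `structuredClone` (modern browsers, recent Node.js), which applies the same structured clone algorithm that `postMessage` uses when no transfer list is given. The small `buffer` and `obj` values are made up for illustration. It suggests that the receiver gets an independent copy rather than a reference.

```javascript
// Illustrative stand-ins for the buffer and obj in the question.
const buffer = new ArrayBuffer(8);
const obj = { mat2: new Float64Array([1, 2, 3]) };
const myobj = { buffer: buffer, obj: obj };

// What the receiving side would get from postMessage(myobj).
const received = structuredClone(myobj);

// Mutate the received copy.
new Uint8Array(received.buffer)[0] = 255;
received.obj.mat2[0] = 99;

// The sender's data is unchanged and still usable.
console.log(new Uint8Array(buffer)[0]); // 0
console.log(obj.mat2[0]);               // 1
console.log(buffer.byteLength);         // 8
```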

Update

I've done some testing for some stats.

var tmp = new ArrayBuffer(90000000);
var test = new Int32Array(tmp);
for (var c = 0; c < test.length; c++) {
    test[c] = 42;
}
for (var i = 0; i < 4; i++) {
    window.setTimeout(function () {
        // Clone the array so that each run sends a fresh copy
        // and the original never has to be repopulated.
        var testsend = new Int32Array(test);
        // Mark the time. The sister mark is in the web worker.
        console.log("sending at at  " + window.performance.now());
        // Post the clone to the thread.
        // FieldValueCommunicator.worker is my own Worker instance (defined elsewhere).
        FieldValueCommunicator.worker.postMessage(testsend);
    }, 1000 * i);
}
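For reference, the amount of data sent per run in a test like this can be read off the buffer itself. This is a small sketch using the same sizes as above: the argument to `new ArrayBuffer` is a length in bytes, and each `Int32Array` element occupies 4 bytes.

```javascript
const tmp = new ArrayBuffer(90000000); // length is given in bytes
const test = new Int32Array(tmp);      // 4 bytes per element

console.log(tmp.byteLength);  // 90000000 bytes
console.log(test.length);     // 22500000 elements
console.log((tmp.byteLength / (1024 * 1024)).toFixed(1) + ' MiB'); // 85.8 MiB
```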

Results of the tests. I don't know whether this falls into your category of slow, since you did not define "slow".

  • sending at 28837.418999988586, received at 28923.06199995801: 86 ms
  • sending at 212387.9840001464, received at 212504.72499988973: 117 ms
  • sending at 247635.6210000813, received at 247760.1259998046: 125 ms
  • sending at 288194.15999995545, received at 288304.4079998508: 110 ms
Tschallacka
  • I would say 86 ms for the data that could have been passed by reference _is_ slow. – Klesun Feb 16 '20 at 16:03
  • @arthur accounting for scheduling, thread sync, decoupling I think it's pretty fast. Also I'm transferring 90000000 * 32 bits = 2880000000 bits = 360000000 bytes = 351562.5kb = 343mb of data. The pc I did this on had ddr2 with 5.3gb data transfer rate. At full uninterupted access the fastest it could have reallocated is 64ms. But since there are other processes needing access you need to wait. Hence the delays and this speed. It would be faster on pcs with faster RAM and CPU and bus speeds – Tschallacka Feb 16 '20 at 17:43
  • Going from the reasonable assumption that all JavaScript variables have their own wrappers assigning them to a thread, scope etc., that needs to be updated. A reference from the JavaScript viewpoint, but from the runtime's viewpoint a shitton of data that needs to be reallocated – Tschallacka Feb 16 '20 at 17:45
  • Yeah, I mean, if it were actually passed by reference, there would be a 0 ms delay, since you would not need to copy anything. You don't normally wait 86 ms when you are passing a variable to a function, right? The [Transferable](https://developer.mozilla.org/en-US/docs/Web/API/Worker/postMessage) for the array buffer seems to be the most efficient solution to OP's problem. Sorry for necromancy, your "I don't know if this falls in your category of slow or not" just triggered me to voice my opinion ;-) – Klesun Feb 16 '20 at 18:20

It depends on how large the data is.

I found an article that says the better strategy is to pass large data to a web worker and back in small pieces. It also discourages the use of ArrayBuffers.

Please have a look: https://developers.redhat.com/blog/2014/05/20/communicating-large-objects-with-web-workers-in-javascript
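The general idea of sending data in pieces can be sketched as follows. This illustrates chunking only. It is not taken from the article, whose own approach may differ, and it uses an ArrayBuffer purely for consistency with the rest of this page. `chunkBuffer` is a hypothetical helper name.

```javascript
// Hypothetical helper: yields consecutive slices of a buffer, each at most
// chunkSize bytes. ArrayBuffer.prototype.slice copies the bytes it returns.
function* chunkBuffer(buffer, chunkSize) {
  for (let offset = 0; offset < buffer.byteLength; offset += chunkSize) {
    yield buffer.slice(offset, offset + chunkSize);
  }
}

const data = new Uint8Array(10).fill(1).buffer;
const chunks = Array.from(chunkBuffer(data, 4));
console.log(chunks.map((chunk) => chunk.byteLength)); // [4, 4, 2]

// In a browser, each chunk could then be posted separately, for example:
// for (const chunk of chunkBuffer(data, 4)) worker.postMessage(chunk, [chunk]);
```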

apiyo
  • That article is from 2014, it's utterly outdated. Passing ArrayBuffers (not even necessarily transferred) is actually a lot faster than strings nowadays. – Kaiido Oct 11 '22 at 05:33