40

I'm trying to upload large files (at least 500MB, preferably up to a few GB) using the WebSocket API. The problem is that I can't figure out how to write "send this slice of the file, release the resources used then repeat". I was hoping I could avoid using something like Flash/Silverlight for this.

Currently, I'm working with something along the lines of:

function FileSlicer(file) {
    // randomly picked 1MB slices,
    // I don't think this size is important for this experiment
    this.sliceSize = 1024*1024;  
    this.slices = Math.ceil(file.size / this.sliceSize);

    this.currentSlice = 0;

    this.getNextSlice = function() {
        var start = this.currentSlice * this.sliceSize;
        var end = Math.min((this.currentSlice+1) * this.sliceSize, file.size);
        ++this.currentSlice;

        return file.slice(start, end);
    }
}

Then, I would upload using:

function Uploader(url, file) {
    var fs = new FileSlicer(file);
    var socket = new WebSocket(url);

    socket.onopen = function() {
        for(var i = 0; i < fs.slices; ++i) {
            socket.send(fs.getNextSlice()); // see below
        }
    }
}

Basically, send() returns immediately, bufferedAmount stays unchanged (0), and the loop keeps iterating, adding all the slices to the queue before attempting to send any of them. There's no socket.afterSend that would let me queue the slices properly, which is where I'm stuck.

Vlad Ciobanu
  • 1
    Assuming I don't want to depend on Flash/Silverlight, what should I use? XMLHttpRequest? I was under the impression that WebSockets have less overhead. – Vlad Ciobanu Jun 18 '12 at 10:40
  • 2
    Websockets have less overhead for bidirectional communication, yes, but uploading a file is simply sending a POST request to a server with the file in the body. Browsers are very good at that and the overhead for a big file is really near nothing. – Denys Séguret Jun 18 '12 at 10:43
  • I was considering slicing it up in smaller bits. I guess I'll try slicing it using the File API and sending it using XMLHttpRequest, see how that goes. Thank you for your help. If you want to make an answer with the info above, and possibly any other advice I'd happily accept it as the answer. – Vlad Ciobanu Jun 18 '12 at 10:54
  • Just replace the XHR call with a WebSocket send and you get large file uploads: http://stackoverflow.com/questions/5053290/large-file-upload-though-html-form-more-than-2-gb/10845664#10845664 – kongaraju Apr 08 '13 at 11:12
  • 1
    +1 u ever get this to work? –  May 27 '13 at 04:54
  • 2
    Yes, but I decided to use simple Ajax calls rather than WebSockets. The implementation is trivial: you just need to queue the next send() when the previous one completes. – Vlad Ciobanu May 29 '13 at 13:08
  • @dystroy, The advantage of Websockets is that you have more control. Like a download status (x% complete) bar. – Pacerier Mar 16 '15 at 13:31
  • Well... you don't need websockets for that ^^ – Denys Séguret Mar 16 '15 at 14:03

6 Answers

16

Use web workers to process large files instead of doing it in the main thread, and upload the file data in chunks using file.slice().

This article helps you handle large files in workers. Change the XHR send to a WebSocket send in the main thread.

// Messages from the worker
function onmessage(event) {
    // A worker message arrives as an event; the Blob or File is in event.data
    ws.send(event.data);
}

//construct file on server side based on blob or chunk information.
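
The worker side is not shown above. A minimal sketch of what it might look like follows, assuming the main thread posts the File to the worker and forwards every message it gets back to the WebSocket. `CHUNK_SIZE` and `sliceFile` are illustrative names, not taken from the article, and this sketch does not throttle: all chunks are posted as soon as they are sliced.

```javascript
// Hypothetical worker script; CHUNK_SIZE and sliceFile are illustrative names.
const CHUNK_SIZE = 1024 * 1024; // 1 MB, the same size the question uses

// Yield successive slices of a File or Blob. slice() only creates a
// reference to a byte range, so the file is not read into memory here.
function* sliceFile(file, chunkSize) {
    for (let start = 0; start < file.size; start += chunkSize) {
        yield file.slice(start, Math.min(start + chunkSize, file.size));
    }
}

// Only register the handler when actually running inside a web worker.
if (typeof WorkerGlobalScope !== "undefined" && self instanceof WorkerGlobalScope) {
    self.onmessage = function (event) {
        // event.data is assumed to be the File posted by the main thread
        for (const chunk of sliceFile(event.data, CHUNK_SIZE)) {
            self.postMessage(chunk);
        }
    };
}
```

On the main thread this would pair with something like `worker.postMessage(file)`, where `worker` is whatever Worker instance loads the script above.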
kongaraju
  • 4
    Your solution is really slick. I tried it and it worked perfectly for large files of 1 GB and up. One drawback currently is that all sends are executed asynchronously, so you have no control over when the file has been sent completely. I did it as part of a unit test for WebSocket; if someone wants to reuse it, the sources can be found at https://github.com/drogatkin/TJWS2/tree/master/1.x/test/html-js – Singagirl Mar 08 '15 at 04:06
  • The WS server can simply send back a message when the file is processed. It can even send messages during processing to effect a progress bar on the client, since the main thread is not blocked by the worker. – Dominic Cerisano Aug 25 '16 at 20:22
  • 1
    XHRs don't run on the main thread (unless explicitly set to run synchronously), so threading is not a justification for using web workers. The difference is that XHR is in the window's context, it dies if the tab gets closed, while web workers can continue to run until the browser process terminates. You can use XHR as well as WebSocket in a web worker. – Daniel F May 11 '18 at 16:36
11

I believe the send() method is asynchronous, which is why it returns immediately. To make it queue, you'd need the server to send a message back to the client after each slice is uploaded; the client can then decide whether it needs to send the next slice or an "upload complete" message back to the server.

This sort of thing would probably be easier using XMLHttpRequest(2); it has callback support built-in and is also more widely supported than the WebSocket API.
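
A rough sketch of the XMLHttpRequest approach, where each slice is sent only after the previous request has completed. `uploadInChunks`, its parameters, and the use of a `Content-Range` request header are assumptions for illustration; the server would have to be written to understand them. The `createXhr` argument exists only so the function can be exercised without a browser.

```javascript
// Sketch only: the function name, callback shape and Content-Range
// convention are assumptions, not part of any particular server API.
function uploadInChunks(file, url, chunkSize, onDone, createXhr) {
    // createXhr is injectable for testing; by default use the browser's XHR
    createXhr = createXhr || function () { return new XMLHttpRequest(); };
    var start = 0;

    function sendNext() {
        if (start >= file.size) {
            onDone(null); // every slice was accepted
            return;
        }
        var end = Math.min(start + chunkSize, file.size);
        var xhr = createXhr();
        xhr.open("POST", url);
        // Tell the (hypothetical) server which byte range this request carries
        xhr.setRequestHeader("Content-Range",
            "bytes " + start + "-" + (end - 1) + "/" + file.size);
        xhr.onload = function () {
            if (xhr.status >= 200 && xhr.status < 300) {
                start = end;
                sendNext(); // queue the next slice only after this one finished
            } else {
                onDone(new Error("Upload failed with status " + xhr.status));
            }
        };
        xhr.onerror = function () {
            onDone(new Error("Network error"));
        };
        xhr.send(file.slice(start, end));
    }

    sendNext();
}
```

Because only one slice is in flight at a time, the browser never has to hold more than one slice's worth of data for the upload.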

Graham
5

In order to serialize this operation you need the server to send you a signal every time a slice is received & written (or an error occurs), this way you could send the next slice in response to the onmessage event, pretty much like this:

function Uploader(url, file) {
    var fs = new FileSlicer(file);
    var socket = new WebSocket(url);

    socket.onopen = function() {
        // Send only the first slice; the rest follow as acknowledgements arrive.
        socket.send(fs.getNextSlice());
    }
    socket.onmessage = function(ms) {
        if (ms.data == "ok") {
            // The server confirmed the previous slice; send the next one, if any.
            if (fs.currentSlice < fs.slices) socket.send(fs.getNextSlice());
        } else {
            // handle the error code here.
        }
    }
}
3

EDIT: The web world, browsers, firewalls, proxies, changed a lot since this answer was made. Right now, sending files using websockets can be done efficiently, especially on local area networks.


Original Answer:

Websockets are very efficient for bidirectional communication, especially when you're interested in pushing information (preferably small) from the server. They act as bidirectional sockets (hence their name).

Websockets don't look like the right technology to use in this situation, especially given that using them adds incompatibilities with some proxies, browsers (IE), or even firewalls.

On the other hand, uploading a file is simply sending a POST request to a server with the file in the body. Browsers are very good at that and the overhead for a big file is really near nothing. Don't use websockets for that task.
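
As a rough illustration of the plain POST approach, here is a sketch that sends the whole file in one request and reports progress through the XMLHttpRequest `upload` object. `uploadWithProgress` and its callbacks are hypothetical names; the optional `xhr` argument is there only so the function can be tried outside a browser.

```javascript
// Sketch only: function name and callback shape are illustrative.
function uploadWithProgress(file, url, onProgress, onDone, xhr) {
    xhr = xhr || new XMLHttpRequest();
    xhr.open("POST", url);

    // Upload progress listeners must be attached before send() is called
    xhr.upload.onprogress = function (event) {
        if (event.lengthComputable) {
            onProgress(Math.round((event.loaded / event.total) * 100));
        }
    };
    xhr.onload = function () {
        var ok = xhr.status >= 200 && xhr.status < 300;
        onDone(ok ? null : new Error("Upload failed with status " + xhr.status));
    };
    xhr.onerror = function () {
        onDone(new Error("Network error"));
    };

    // The browser streams the File from disk; no manual slicing is needed
    xhr.send(file);
    return xhr;
}
```

The percentage passed to `onProgress` can drive a progress bar directly, which is the kind of feedback discussed in the comments on the question.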

user513951
Denys Séguret
  • 39
    dystroy, your information is out of date. The standardized WebSocket protocol (IETF RFC 6455) supports sending and receiving direct binary data (ArrayBuffer and Blob). You're thinking of the old Hixie protocol, which only supported sending UTF-8 data (which required encoding binary data). Also, the IETF RFC 6455 version of the WebSocket protocol was specifically designed to inter-operate with existing proxies and firewalls. I have used WebSockets extensively and do not see the issues you imply. Please cite evidence that there are widespread problems. – kanaka Jun 18 '12 at 17:06
  • 3
    I won't say you're wrong on the IETF 6455 (especially given that searches about this topic lead to your recent efforts to work on compatibility with this new standard in websockify), and this information is welcome, but the world isn't totally converted. See [this proxy problem](http://stackoverflow.com/questions/10947298/redirecting-websocket-traffic-on-port-80-with-lighttpd). Besides, look for "browser support" on [this page](http://en.wikipedia.org/wiki/WebSocket). And basically there is *no* reason to use websockets to upload a file. – Denys Séguret Jun 18 '12 at 17:12
  • 5
    If you remove the entire second paragraph then I have no problem with your answer but the second paragraph is mostly wrong. JSON is just one method of textual serializing/encoding and has nothing directly to do with WebSockets. Base64 is about 33% larger, but it is not CPU heavy (even doing it directly in Javascript). There are certainly buggy intermediaries but there is no widespread problem. The only in-the-wild major browser that still uses Hixie is iOS Safari (and it's possible that iOS 6 will change that). Chrome, Firefox, IE 10, Opera (there but disabled) all use IETF 6455. – kanaka Jun 18 '12 at 18:05
  • 4
    I never mentioned CPU. And I know you're a competent promoter of the new version of websockets, but it's unfair to OP (**who just wants to upload a file**) to let them think there is now no compatibility problem (I mentioned proxies, for example). – Denys Séguret Jun 18 '12 at 18:14
  • 5
    dystroy, please don't put words in my mouth. Your answer is fine, but your rationale is flawed. I did not say or imply that WebSockets is the better choice for large file uploads. If you address the issues I'll remove the downvote. Your edit did not improve the situation. And who is "katana"? – kanaka Jun 18 '12 at 20:15
3

You could use https://github.com/binaryjs/binaryjs or https://github.com/liamks/Delivery.js if you can run node.js on the server.

Ibrahim Muhammad
0

I think this socket.io project has a lot of potential:

https://github.com/sffc/socketio-file-upload

It supports chunked upload, progress tracking and seems fairly easy to use.

Kesarion