215

I'm writing a Web application that needs to store JSON data in a small, fixed-size server-side cache via AJAX (think: OpenSocial quotas). I do not have control over the server.

I need to reduce the size of the stored data to stay within a server-side quota, and was hoping to be able to gzip the stringified JSON in the browser before sending it up to the server.

However, I cannot find much in the way of JavaScript implementations of Gzip. Any suggestions for how I can compress the data on the client side before sending it up?

Roy Tinker
David Citron
  • 7
    You are sending it *up* to the server. That's why there are the notions of "upload" and "download". Maybe that's why you are getting answers that tell you "the server can do it". – Tomalak Nov 16 '08 at 20:27
  • 3
    A proper implementation of this is probably tricky, since javascript is single threaded. It would probably have to compress in batches, using setTimeout(), so that the UI doesn't lock up while compressing. – August Lilleaas Sep 15 '09 at 17:47
  • perhaps you could write your own compression algorithm – Captain kurO Jul 06 '12 at 03:43
  • 3
    @AugustLilleaas now you can use webworkers to do this :) – Captain Obvious Nov 19 '14 at 10:24
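The batching idea from the comments above can be sketched roughly as follows. This is only an illustration of yielding to the UI between slices of work, not a compressor. `processInBatches`, its parameters, and the optional `schedule` argument are hypothetical names introduced here; `schedule` defaults to `setTimeout` and exists so the scheduling can be swapped out.

```javascript
// Hypothetical helper (not from any library): walks `input` in slices of
// `batchSize` characters and yields to the event loop between slices.
function processInBatches(input, batchSize, step, done, schedule) {
    // Assumption: setTimeout(fn, 0) is an acceptable way to yield to the UI.
    schedule = schedule || function (fn) { setTimeout(fn, 0); };
    var pos = 0;

    function runBatch() {
        var end = Math.min(pos + batchSize, input.length);
        for (; pos < end; pos++) {
            step(input.charAt(pos), pos); // per-character work goes here
        }
        if (pos < input.length) {
            schedule(runBatch);           // more input left: continue later
        } else {
            done();                       // all input consumed
        }
    }

    runBatch();
}
```

A real compressor would keep its dictionary and output buffer in the closure and call `done` with the result. A Web Worker, as the last comment suggests, avoids the need for slicing altogether.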

9 Answers

149

Edit: There appears to be a better LZW solution that handles Unicode strings correctly at http://pieroxy.net/blog/pages/lz-string/index.html (thanks to pieroxy in the comments).


I don't know of any gzip implementations, but the jsolait library (the site seems to have gone away) has functions for LZW compression/decompression. The code is covered under the LGPL.

// LZW-compress a string
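function lzw_encode(s) {
    // Null-prototype object, so inherited keys such as "constructor" never match a phrase
    var dict = Object.create(null);
    var data = (s + "").split("");
    if (data.length === 0) return ""; // nothing to encode
    var out = [];
    var currChar;
    var phrase = data[0];
    var code = 256;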
    for (var i=1; i<data.length; i++) {
        currChar=data[i];
        if (dict[phrase + currChar] != null) {
            phrase += currChar;
        }
        else {
            out.push(phrase.length > 1 ? dict[phrase] : phrase.charCodeAt(0));
            dict[phrase + currChar] = code;
            code++;
            phrase=currChar;
        }
    }
    out.push(phrase.length > 1 ? dict[phrase] : phrase.charCodeAt(0));
    for (var i=0; i<out.length; i++) {
        out[i] = String.fromCharCode(out[i]);
    }
    return out.join("");
}

// Decompress an LZW-encoded string
function lzw_decode(s) {
    var dict = {};
    var data = (s + "").split("");
    var currChar = data[0];
    var oldPhrase = currChar;
    var out = [currChar];
    var code = 256;
    var phrase;
    for (var i=1; i<data.length; i++) {
        var currCode = data[i].charCodeAt(0);
        if (currCode < 256) {
            phrase = data[i];
        }
        else {
            phrase = dict[currCode] ? dict[currCode] : (oldPhrase + currChar);
        }
        out.push(phrase);
        currChar = phrase.charAt(0);
        dict[code] = oldPhrase + currChar;
        code++;
        oldPhrase = phrase;
    }
    return out.join("");
}
Matthew Crumley
  • 2
    How can the code be LGPL if the algorithm is patented? Or are all patents truly expired? – David Citron Nov 16 '08 at 22:05
  • 11
    According to Wikipedia, the patents expired a few years ago. It might be a good idea to check that out though. – Matthew Crumley Nov 16 '08 at 22:39
  • 3
    LZW is way too old to still be patented. Last patents ran out in 2003 or so. There are loads of free implementations. – ypnos Nov 16 '08 at 22:39
  • 5
    I see at least two problems with the code above: 1) try to compress "Test to compress this \u0110\u0111\u0112\u0113\u0114 non ascii characters.", 2) No error is reported if code > 65535. – some Nov 17 '08 at 05:40
  • And I forgot the third one: The output from encode is in UTF-16. Does your application handle that? – some Nov 17 '08 at 05:47
  • Here is some info on how to compress Unicode: http://unicode.org/faq/compression.html. It looks as if this is not so trivial. – Tomalak Nov 18 '08 at 07:07
  • There's a different LZW implementation at http://zapper.hodgers.com/files/javascript/lzw_test/lzw.js I've no idea whether this addresses any of the above concerns. Also see the related blog post: http://zapper.hodgers.com/labs/?p=90 – msanders Jan 08 '09 at 11:27
  • FWIW -- I don't think the zapper.hodgers.com implementation addresses the problems described above. It worked fine with 'plain old ASCII', but when I tried it on a string generated by the HTML canvas toDataUrl() method, for example, the compressed-then-decompressed string didn't match the original. Has anyone implemented JavaScript compression *and* decompression in a way that addresses the issues above *and* can cope performance-wise with strings up to about 500K in length -- I realise this is a tall order! – Sam Dutton Aug 05 '10 at 11:37
  • @Sam - try utf8_encode(lzw_encode(my_string)). Here's a UTF8 encoder in Javascript: http://farhadi.ir/works/utf8. – Roy Tinker Oct 27 '10 at 18:56
  • 6
    Here are implementations in 21 different languages http://rosettacode.org/wiki/LZW_compression it's written that it's in public domain from 2004. – jcubic Feb 20 '11 at 22:08
  • Has anyone written a compression algorithm that works properly yet? – Olivier Lalonde Apr 27 '13 at 02:50
  • I couldn't find jsolait library in the link, can I just use the code of answer in my free website app? – simo May 01 '13 at 03:37
  • @simo It looks like the site has been taken down (the content at least). Yes, the LGPL lets you use the code as-is in your application. – Matthew Crumley May 01 '13 at 12:57
  • 6
    @some I just released a small lib correcting exactly the problems you're pointing out here: http://pieroxy.net/blog/pages/lz-string/index.html – pieroxy May 09 '13 at 10:32
  • @pieroxy Thanks, I added a link to your site. – Matthew Crumley May 09 '13 at 13:40
  • @Matthew Crumley thanks. I'm using the lib for a home-grown RSS reader and it seems pretty stable. I've compressed (and successfully decompressed) several dozens of megabytes so far with no issue in sight. I'm already thinking of adding some Huffman to properly encode the tokens generated by LZW but I like the speed of it, so I don't know where I'm going to go with this yet. – pieroxy May 13 '13 at 08:29
  • Nice - http://jsfiddle.net/lordloh/uJj7d/ – Lord Loh. Feb 28 '14 at 05:06
  • it didn't work correctly in my test - https://jsfiddle.net/4rh4js8h/1/ – hienbt88 Apr 17 '17 at 08:30
  • @hienbt88 It looks like you're running into the Unicode issue with the second "-" that's not a regular hyphen character. I would try pieroxy's code, which should be able to handle non-ASCII characters. – Matthew Crumley Apr 17 '17 at 14:09
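The Unicode and `code > 65535` problems raised in the comments can be sidestepped by running LZW over UTF-8 bytes and keeping the output as numbers. The following is a minimal sketch under those assumptions, not the jsolait code and not how lz-string works. `lzwEncodeBytes` and `lzwDecodeBytes` are hypothetical names, and the sketch assumes `TextEncoder`, `TextDecoder`, and `Map` are available.

```javascript
// Sketch only: LZW over UTF-8 bytes, with numeric codes as output.
// Function names are hypothetical and not part of any library.
function lzwEncodeBytes(text) {
    var bytes = new TextEncoder().encode(text); // every input symbol is now 0..255
    if (bytes.length === 0) return [];
    var dict = new Map();                       // phrase (2+ bytes) -> code
    var nextCode = 256;
    var out = [];
    var phrase = String.fromCharCode(bytes[0]);

    function codeFor(p) {
        // Single bytes encode as themselves; longer phrases use the dictionary.
        return p.length > 1 ? dict.get(p) : p.charCodeAt(0);
    }

    for (var i = 1; i < bytes.length; i++) {
        var ch = String.fromCharCode(bytes[i]);
        if (dict.has(phrase + ch)) {
            phrase += ch;
        } else {
            out.push(codeFor(phrase));
            dict.set(phrase + ch, nextCode++);
            phrase = ch;
        }
    }
    out.push(codeFor(phrase));
    return out; // plain numbers, so codes above 65535 are not truncated
}

function lzwDecodeBytes(codes) {
    if (codes.length === 0) return "";
    var dict = new Map();                       // code -> phrase
    var nextCode = 256;
    var prev = String.fromCharCode(codes[0]);
    var pieces = [prev];

    for (var i = 1; i < codes.length; i++) {
        var code = codes[i];
        var phrase;
        if (code < 256) {
            phrase = String.fromCharCode(code);
        } else if (dict.has(code)) {
            phrase = dict.get(code);
        } else {
            phrase = prev + prev.charAt(0);     // code not yet in the dictionary
        }
        pieces.push(phrase);
        dict.set(nextCode++, prev + phrase.charAt(0));
        prev = phrase;
    }

    // Turn the byte-per-character string back into real bytes, then into text.
    var joined = pieces.join("");
    var buf = new Uint8Array(joined.length);
    for (var j = 0; j < joined.length; j++) {
        buf[j] = joined.charCodeAt(j);
    }
    return new TextDecoder().decode(buf);
}
```

The result is an array of numbers rather than a string, so the caller still has to choose a serialization (for example packing the codes into bytes) before storing it. That step is left out here on purpose.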
60

I had a different problem: I did not want to encode data in gzip but to decode gzipped data. I am running JavaScript code outside of the browser, so I need to decode it using pure JavaScript.

It took me some time, but I found that the JSXGraph library has a way to read gzipped data.

Here is where I found the library: http://jsxgraph.uni-bayreuth.de/wp/2009/09/29/jsxcompressor-zlib-compressed-javascript-code/ There is even a standalone utility that can do that, JSXCompressor, and the code is licensed under the LGPL.

Just include the jsxcompressor.js file in your project and you will be able to read base64-encoded gzipped data:

<!doctype html>
<html>
<head>
<title>Test gzip decompression page</title>
<script src="jsxcompressor.js"></script>
</head>
<body>
<script>
    document.write(JXG.decompress('<?php
        echo base64_encode(gzencode("Try not. Do, or do not. There is no try."));
    ?>'));
</script>
</body>
</html>

I understand it is not what you wanted but I still reply here because I suspect it will help some people.
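For readers without PHP at hand, the `base64_encode(gzencode(...))` step above can be mirrored in JavaScript. This is a sketch that assumes Node.js, whose standard `zlib` module and `Buffer` are used here. It does not use JSXCompressor, and `gzipToBase64` and `gunzipFromBase64` are hypothetical helper names.

```javascript
// Assumes Node.js: zlib and Buffer are part of its standard library.
var zlib = require("zlib");

// Hypothetical helper: gzip a string and return the result as base64 text.
function gzipToBase64(text) {
    return zlib.gzipSync(Buffer.from(text, "utf8")).toString("base64");
}

// Hypothetical helper: reverse of gzipToBase64.
function gunzipFromBase64(b64) {
    return zlib.gunzipSync(Buffer.from(b64, "base64")).toString("utf8");
}

var encoded = gzipToBase64("Try not. Do, or do not. There is no try.");
var decoded = gunzipFromBase64(encoded);
```

The `encoded` string is the kind of value the page above passes to `JXG.decompress`. Decoding it with `gunzipFromBase64` is a quick way to check a payload outside the browser.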

pcans
  • 3
    Thank you a lot for still sharing. This is exactly what I needed. You probably saved me hours of unsuccessful searching which I really can't spare. +1 – Kiruse Jun 25 '12 at 17:48
  • 1
    I wonder why on earth it is called "compressor" when it is an UNcompressor. lol – matteo May 07 '13 at 18:54
  • 1
    almost 5 years later, still useful. thank you. I'm dumping a large JSON directly to the page, instead of AJAX'ing it. by pre-compressing it with PHP and decompressing it back in JavaScript's client side - I'm saving some of the overhead. –  Dec 05 '14 at 19:05
  • Do we need the ` – Jus12 Jul 07 '16 at 07:00
  • i get `14:16:28.512 TypeError: e.replace is not a function[Weitere Informationen] jsxcompressor.min.js:19:12201` – Bluscream Jun 22 '18 at 12:17
  • THANK YOU! It really help me, I've been searching for the solution for 2 days! Again, thank you so much... – Yohannes Kristiawan Sep 07 '18 at 12:58
43

We just released pako (https://github.com/nodeca/pako), a port of zlib to JavaScript. I think it is now the fastest JS implementation of deflate / inflate / gzip / ungzip. It is also under the permissive MIT license. Pako supports all zlib options, and its output is binary-identical to zlib's.

Example:

// `zipped` is assumed to be a Uint8Array holding the compressed data
var inflate = require('pako/lib/inflate').inflate;
var text = inflate(zipped, {to: 'string'});
Suzanne Soy
Vitaly
17

I ported an implementation of LZMA from a GWT module into standalone JavaScript. It's called LZMA-JS.

14

Here are some other compression algorithms implemented in Javascript:

Mauricio Scheffer
9

I have not tested it, but there is a JavaScript implementation of ZIP called JSZip:

https://stuk.github.io/jszip/

Joel
Sirber
  • 1
    That's zip, not gzip, and it uses pako under the hood. Difference is that zip has file info metadata. – Vitaly Jan 03 '16 at 00:03
0

I guess a generic client-side JavaScript compression implementation would be a very expensive operation in terms of processing time as opposed to transfer time of a few more HTTP packets with uncompressed payload.

Have you done any testing that would give you an idea of how much time there is to save? I mean, bandwidth savings can't be what you're after, or can they?

Tomalak
-6

Most browsers can decompress gzip on the fly. That might be a better option than a javascript implementation.

-6

You can embed a 1-by-1-pixel Java applet in the page and use that for compression.

It's not JavaScript and the clients will need a Java runtime but it will do what you need.

Bogdan
  • 7
    Interesting, but I'd rather avoid including an applet if possible. – David Citron Nov 16 '08 at 20:53
  • I'd like to add the real use cases – cmc Jul 30 '12 at 16:00
  • 3
    Not a good solution as it adds a dependency to Java. Apart from that, not everyone has bothered to install java - the site won't work for some people. Personally I have java installed since I needed it for something a long time ago, but I prefer to visit sites that don't use java. – Onkelborg Feb 11 '13 at 14:22