Compressing JSON by replacing often used Strings?

Question

I currently work with JSON and save some data. Since I could save a lot of space by replacing frequently used strings, I wonder whether there is an algorithm that does this. I would prefer JavaScript, since I am working with JavaScript and NodeWebkit, but it would be good to know whether something like this exists at all. Because I do this with NodeWebkit, the data is stored on the client's computer, so I have no server to communicate with. Additionally, it must be a standalone application, so I should not use external programs.

I imagine going from this:

{
    "Attribute1" : "This is my very long string",
    "Attribute2" : "This is my very long string",
    "Attribute3" : {
         "innerObjectAttribute": "This string contains the word Attribute"
     }
}

to an object something like this:

{
    "$$1" : "Attribute",
    "$$2" : "This is my very long string",
    "data": {
           "$$11" : "$$2",
           "$$12" : "$$2",
           "$$13" : {
                 "innerObject$$1" : "This string contains the word $$1"
            }  
     } 
}

Already in this example the algorithm would save space (ignoring whitespace), but imagine a case where a long word, or part of a path (which is my case), is used many times: there it could save a lot of space.

My old JSON object would simply be saved under the data attribute. Every replaced string comes before it and has its own attribute, so it appears only once in the whole JSON file.

The algorithm itself should handle the problem of strings like $$1 appearing in the user's own data.

I imagine getting my input JSON string back with a parse/undo function. Can anyone help here?
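A minimal sketch of the scheme described above, under simplifying assumptions: only whole string values that occur more than once are replaced (not keys or substrings), the dictionary is stored as an array rather than as `$$n` attributes, and user strings that start with `$` are escaped with a `$_` prefix so they cannot collide with tokens. The names `walk`, `pack`, and `unpack` are hypothetical and do not come from an existing library.

```javascript
// Apply fn to every string value in a JSON-compatible structure,
// returning a new structure and leaving keys untouched.
function walk(value, fn) {
  if (typeof value === "string") return fn(value);
  if (Array.isArray(value)) return value.map((v) => walk(v, fn));
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const key of Object.keys(value)) out[key] = walk(value[key], fn);
    return out;
  }
  return value; // numbers, booleans, null
}

// Replace repeated string values with "$$<index>" tokens.
function pack(data) {
  const counts = new Map();
  walk(data, (s) => {
    counts.set(s, (counts.get(s) || 0) + 1);
    return s;
  });

  const dict = [];
  const index = new Map();
  for (const [s, n] of counts) {
    if (n > 1) {
      index.set(s, dict.length);
      dict.push(s);
    }
  }

  const packed = walk(data, (s) => {
    if (index.has(s)) return "$$" + index.get(s);
    // Escape literals starting with "$" so they never look like a token.
    return s.charAt(0) === "$" ? "$_" + s : s;
  });
  return { dict: dict, data: packed };
}

// Undo pack(): resolve tokens and strip the escape prefix.
function unpack(container) {
  return walk(container.data, (s) => {
    if (s.slice(0, 2) === "$$") return container.dict[Number(s.slice(2))];
    if (s.slice(0, 2) === "$_") return s.slice(2);
    return s;
  });
}
```

Calling `unpack(pack(obj))` should give back a structure equal to `obj` for any JSON-compatible input. A real implementation would probably also skip strings that are shorter than their token and handle keys and substrings, which this sketch leaves out.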

well.. you need to write a library for it to transform and parse and we can definitely *help* in writing one. — Amit Joki, Jan 20 '15 at 14:52
If you're not sending/receiving data, why is it important to compress your JSON? Can you zip your content to `my_json.json.zip`? — Halcyon, Jan 20 '15 at 15:04
I want to do this because I could save a great deal of space by compressing it like this. I think zipping could be rather inefficient for storage, because I would need to save (and of course load) the big file and then zip it... — Ba5t14n, Jan 20 '15 at 15:07

score 1 · Answer 1 · answered Jan 20 '15 at 14:59


This is, in its simplest form, the idea behind every dictionary-based compression scheme (gzip, zip, deflate). Pretty much every web server has a gzip/deflate module; just activate it. gzip/deflate compression is specified in HTTP. The advantage is that gzip is far more sophisticated than your approach, is applied transparently, and is used only if the client can decompress it (which pretty much every HTTP client can).

An example request looks like this:

GET /encrypted-area HTTP/1.1
Host: www.example.com
Accept-Encoding: gzip, deflate

Response

HTTP/1.1 200 OK
...
Accept-Ranges: bytes
Content-Length: 438
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip
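For the standalone case without a web server, the same gzip compression can be applied locally before the data is written to disk. The following is a minimal sketch that assumes Node.js's built-in `zlib` module is available in the NodeWebkit environment; `compressJson` and `decompressJson` are hypothetical helper names, while `zlib.gzipSync` and `zlib.gunzipSync` are the actual Node.js calls.

```javascript
const zlib = require("zlib");

// Serialize a value to JSON and gzip it in memory; returns a Buffer.
function compressJson(value) {
  return zlib.gzipSync(Buffer.from(JSON.stringify(value), "utf8"));
}

// Reverse of compressJson: gunzip the Buffer and parse the JSON text.
function decompressJson(buffer) {
  return JSON.parse(zlib.gunzipSync(buffer).toString("utf8"));
}

// Sample data with many repeated strings, similar to the question's case.
const sample = [];
for (let i = 0; i < 100; i++) {
  sample.push({ path: "/some/very/long/shared/path/segment", text: "This is my very long string" });
}

const compressed = compressJson(sample);
const restored = decompressJson(compressed);

console.log("plain bytes:", Buffer.byteLength(JSON.stringify(sample), "utf8"));
console.log("gzip bytes: ", compressed.length);
console.log("round trip ok:", JSON.stringify(restored) === JSON.stringify(sample));
```

The compression happens in memory, so the uncompressed file never has to be saved first. The resulting Buffer could be written with `fs.writeFileSync` and read back with `fs.readFileSync`.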

answered Jan 20 '15 at 14:59

Patrick


Because I do this in NodeWebkit, I don't have a server. My application needs to be standalone, and the JSON is saved to the client's computer. Because my data uses the same strings very often, I need such an algorithm. – Ba5t14n Jan 20 '15 at 15:01

Well, then the complexity of the implementation won't justify the advantage of saving maybe 10-20%. If you still insist, use a JavaScript compression library; see http://stackoverflow.com/questions/294297/javascript-implementation-of-gzip. Do not reinvent the wheel. – Patrick Jan 20 '15 at 15:04
