
I have a web application (Ruby on Rails) that sends some YAML as the value of a hidden input field.

Now I want to reduce the size of the text that is sent to the browser. Which form of lossless compression would produce the smallest payload? I'm fine with incurring the additional cost of compression and decompression on the server side.

gnarsi

2 Answers


You could use the zlib bindings in Ruby's standard library to deflate and inflate data:

require "zlib"
data = "some long yaml string" * 100
compressed_data = Zlib::Deflate.deflate(data)
#=> "x\x9C+\xCE\xCFMU\xC8\xC9\xCFKW\xA8L\xCC\xCDQ(.)\xCA\xCCK/\x1E\x15\x1C\x15\x1C\x15\x1C\x15\x1C\x15\x1C\x15\x1C\x15\x1C\x15D\x15\x04\x00\xB3G%\xA6"

You should Base64-encode the compressed data so that it is printable and can be placed in a form field:

require 'base64'
encoded_data = Base64.encode64 compressed_data
#=> "eJwrzs9NVcjJz0tXqEzMzVEoLinKzEsvHhUcFRwVHBUcFRwVHBUcFUQVBACz\nRyWm\n"
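Note that `Base64.encode64` wraps its output with a newline every 60 characters, which is visible in the result above. For a hidden input value you may prefer output without line breaks. A minimal sketch of that variant, assuming only the standard library (the variable names are illustrative):

```ruby
require "zlib"
require "base64"

data = "some long yaml string" * 100

# strict_encode64 emits no line breaks, unlike encode64.
encoded = Base64.strict_encode64(Zlib::Deflate.deflate(data))

# Round trip back to the original string.
restored = Zlib::Inflate.inflate(Base64.strict_decode64(encoded))
puts restored == data
```

If the value has to travel in a URL rather than a form field, `Base64.urlsafe_encode64` is the analogous call.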

Later, on the client side, you can use pako (a port of zlib to JavaScript) to get your data back. This answer will probably help you with implementing the JS part.

To give you an idea on how effective this is, here are the sizes of the example strings:

data.size            # 2100
compressed_data.size #   48
encoded_data.size    #   66

The same approach works in the other direction, compressing on the client and inflating on the server:

Zlib::Inflate.inflate(Base64.decode64(encoded_data))
#=> "some long yaml stringsome long yaml str..." (shortened, as the string is long)

Disclaimer:

  • The Ruby zlib implementation should be compatible with the pako implementation, but I have not tried it.
  • The string sizes above are flattering. Zlib is very effective here because the string repeats a lot. Real-life data usually does not repeat as much, so expect a smaller reduction.
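To get a feel for how much the repetition matters, the comparison below deflates the example string next to a less repetitive string of the same length. This is only a sketch: the random hex string is a hypothetical stand-in for real data, and actual YAML will usually land somewhere between the two.

```ruby
require "zlib"
require "securerandom"

repetitive = "some long yaml string" * 100
# Hypothetical stand-in for less repetitive data: random hex of the same length.
varied = SecureRandom.hex(repetitive.bytesize / 2)

[["repetitive", repetitive], ["varied", varied]].each do |label, text|
  compressed = Zlib::Deflate.deflate(text, Zlib::BEST_COMPRESSION)
  puts format("%-10s %5d -> %5d bytes", label, text.bytesize, compressed.bytesize)
end
```

Running the same measurement on a sample of your own YAML is the most reliable way to decide whether the compression step is worth it.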
tessi
  • 3
    It seems I down-voted this by accident a few days ago, as I don't recall doing it. If you would, please make an edit so I can retract my accidental down-vote :( – Krule Sep 19 '16 at 08:34
  • 4
    @Krule thanks for being nice. At first I wasn't sure I could find a useful update, but then I stumbled over pako, which (it seems) is a far better zlib JS library than zlib. So thanks for the reminder to look at my answer again, I could actually improve it :) – tessi Sep 19 '16 at 09:13
  • 7
    Please note that the output of `Zlib::Deflate.deflate` is not compatible with the format that the `gzip` command line utility generates and will not be accepted by `gunzip`, which expects some header data before the compressed content. If you want to read the output using `gunzip`, the following code will be useful: `Zlib::Deflate.new(nil, 31).deflate(data, Zlib::FINISH)` – Guss Oct 03 '16 at 09:18
  • 1
    That is very efficient because of the example data set, which duplicates the text many times; with other data it is less efficient. – Ian Vaughan Nov 04 '17 at 23:25
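The gzip-format point raised in the comments can be checked with a short sketch. It assumes Ruby 2.4 or later for `Zlib.gunzip`; passing 31 (15 + 16) as the window bits asks zlib to wrap the stream in a gzip header and trailer.

```ruby
require "zlib"

data = "some long yaml string" * 100

# Raw zlib stream, as produced by Zlib::Deflate.deflate.
zlib_stream = Zlib::Deflate.deflate(data)

# gzip-wrapped stream: window bits 31 = 15 (max window) + 16 (gzip wrapper).
gzipped = Zlib::Deflate.new(nil, 31).deflate(data, Zlib::FINISH)

# A gzip stream starts with the magic bytes 0x1f 0x8b; a zlib stream does not.
puts gzipped.bytes.first(2) == [0x1f, 0x8b]
puts zlib_stream.bytes.first(2) == [0x1f, 0x8b]

# Zlib.gunzip reads the gzip-wrapped stream back.
restored = Zlib.gunzip(gzipped)
puts restored == data
```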

If you are working on a Rails application, you can also use the ActiveSupport::Gzip wrapper that allows compression/decompression of strings with gzip.

compressed_log = ActiveSupport::Gzip.compress('large string')
#=> "\x1F\x8B\b\x00yq5c\x00\x03..."

original_log = ActiveSupport::Gzip.decompress(compressed_log)
#=> "large string"

Behind the scenes, the compress method uses the Zlib::GzipWriter class, which writes gzipped files. Similarly, the decompress method uses the Zlib::GzipReader class, which reads gzipped files.
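Outside of Rails, roughly the same thing can be done with the standard library alone. The helpers below are a sketch of that idea, not the ActiveSupport source; `gzip_string` and `gunzip_string` are hypothetical names chosen for illustration.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: gzip a string in memory via Zlib::GzipWriter.
def gzip_string(source)
  io = StringIO.new
  io.set_encoding(Encoding::BINARY)
  writer = Zlib::GzipWriter.new(io)
  writer.write(source)
  writer.close # writes the gzip trailer and closes the underlying io
  io.string
end

# Hypothetical helper: read a gzipped string back via Zlib::GzipReader.
def gunzip_string(source)
  Zlib::GzipReader.new(StringIO.new(source)).read
end

compressed_log = gzip_string("large string")
original_log = gunzip_string(compressed_log)
puts original_log == "large string"
```

Because the output is gzip format, it carries a header and trailer that a plain zlib stream does not, so for very short strings the result can be larger than the input.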

software_writer