I am converting a document to Base64 on the fly in my JavaScript client module. The resulting Base64 string is about 125% of the original document size.

I need to compress the base64 string.

What is the best library or function for compression? Any examples or links would be a great help.

Thanks!

nihal
  • An excellent first step would be to decode it back from base64. – Ry- Jul 12 '17 at 13:45
  • To be exact, the converted string should be about 133% of the document size, since Base64 encodes every 3 bytes as 4 characters. – piet.t Jul 12 '17 at 13:49
  • Right. By 125% I meant an approximate figure. – nihal Jul 12 '17 at 14:01
  • @Ryan Is there any encoding other than base64 that provides reasonable compression as well as encoding? – nihal Jul 12 '17 at 14:02
  • @nihal: What are you trying to accomplish? – Ry- Jul 12 '17 at 14:55
  • I am converting a PDF document to base64. The PDF is approx. 500 KB in size. When I convert it to base64, the encoded string is about 650 KB. I want to see if there is any compression scheme or algorithm that can bring the overall size of the base64 string (or other encoded string) down to less than that of the original PDF document. Eventually I plan to store the base64 equivalent of the document on blockchain. – nihal Jul 12 '17 at 15:12
  • Possibly related: https://stackoverflow.com/questions/38124361/why-does-base64-encoded-data-compress-so-bad Looks like base64 compression isn't really possible. – Trevor Jul 12 '17 at 15:29
  • 1.) You do Base64-encoding to store or transmit binary data where you can only use character data. 2.) If you need some sort of compression, then compress the original document with the algorithm of your choice, then Base64-encode the compressed document. – piet.t Jul 13 '17 at 06:18
  • In my case I am trying to compress json with embedded documents. Compressing the docs before converting to base64 would make the customer unhappy. I am just curious what the best compression algorithm is to tackle this. Seems like it wouldn't be that hard to detect data like this, convert it on the fly to a more compressible stream (the original binary), and then be able to reverse that on the way out. – Chris Seline Jan 03 '19 at 19:59
  • The conclusion on our side was: don't store the whole data on Blockchain/DLT, just the hash of the PDF. The entire PDF document can be stored elsewhere. Storing only the signature of the data on Blockchain also satisfies the need to verify it later. Hence compression became secondary. However, this was a good discussion. Thanks for all the replies. – nihal Jul 20 '22 at 01:30
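
The approach in the last comment (record only a hash, keep the PDF elsewhere) could look roughly like the sketch below. It assumes Node.js and SHA-256; the comment does not name a hash function, and `documentFingerprint`, `matchesFingerprint`, and the placeholder bytes are hypothetical names used only for illustration.

```javascript
const crypto = require("node:crypto");

// Return the SHA-256 digest of the document bytes as a 64-character hex string.
// SHA-256 is an assumption; the thread only says "hash".
function documentFingerprint(bytes) {
  return crypto.createHash("sha256").update(bytes).digest("hex");
}

// Check a document retrieved later against the fingerprint recorded earlier.
function matchesFingerprint(bytes, expectedHex) {
  return documentFingerprint(bytes) === expectedHex;
}

// Hypothetical stand-in for the real PDF bytes.
const pdfBytes = Buffer.from("%PDF-1.4 placeholder bytes");
const fingerprint = documentFingerprint(pdfBytes);

console.log(fingerprint.length);                        // hex characters to store
console.log(matchesFingerprint(pdfBytes, fingerprint)); // unchanged document
```

The value that gets stored has a fixed size regardless of how large the PDF is, which is why compression stops mattering with this design.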

0 Answers
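
A minimal sketch of what piet.t's comment describes (compress the original bytes first, then Base64-encode the compressed result), assuming Node.js and its built-in `zlib` module. In a browser client, a library such as pako or the `CompressionStream` API would be needed instead. The helper names and the sample text are hypothetical. `base64Length` restates the 3-bytes-to-4-characters arithmetic from the comments.

```javascript
const zlib = require("node:zlib");

// Base64 emits 4 characters for every 3 input bytes (padded to a multiple of 4).
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Compress the raw bytes first, then Base64-encode the compressed result.
function compressThenEncode(bytes) {
  return zlib.deflateSync(bytes).toString("base64");
}

// Reverse the steps: Base64-decode, then decompress.
function decodeThenDecompress(text) {
  return zlib.inflateSync(Buffer.from(text, "base64"));
}

// Hypothetical stand-in for a document: repetitive text, so it compresses well.
const doc = Buffer.from("The quick brown fox jumps over the lazy dog. ".repeat(1000));
const plain = doc.toString("base64");   // Base64 of the raw bytes
const packed = compressThenEncode(doc); // Base64 of the compressed bytes

console.log(doc.length, plain.length, packed.length);
```

How much this helps depends on the input. PDFs often already contain compressed streams, so the gain on a real PDF may be much smaller than on this sample, and the Base64 step still adds roughly a third on top of whatever the compressor produces.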