
I have a string that I have compressed in JavaScript using lz-string.

compress.js

var compressed = LZString.compress('abc');

I can decompress it in JavaScript using

console.log(LZString.decompress(compressed));

I have copied the compressed string to the clipboard and I am attempting to decompress it in Python using lzip.

decompress.py

import lzip

# compressed is the contents of the clipboard from the javascript
compressed = 'ↂテ䀀'

compressed_bytes = str.encode(compressed)
print(compressed_bytes)
print(lzip.decompress_buffer(compressed_bytes))

I get the error

RuntimeError: Lzip error: Header error

Is this the right approach?

It appears that the compression format used by lz-string is different from the one used by lzip.

When I run

import lzip

compressed = 'ↂテ䀀'

python_compressed = lzip.compress_to_buffer(str.encode('abc'))
print(f'{python_compressed=}')

compressed_bytes = str.encode(compressed)
print(f'{compressed_bytes=}')

I get

python_compressed=b"LZIP\x01\x17\x000\x98\x88\xa4J\x8e\x9f\xff\xf6c\x80\x00\xc2A$5\x03\x00\x00\x00\x00\x00\x00\x00'\x00\x00\x00\x00\x00\x00\x00"

and

compressed_bytes=b'\xe2\x86\x82\xe3\x83\x86\xe4\x80\x80'

So there is a mismatch somewhere.
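A quick way to see the mismatch, as a minimal sketch using only the standard library: lzip data starts with the magic bytes `LZIP` (visible in the `python_compressed` output above), while the clipboard string does not, whichever way it is encoded. The helper name `looks_like_lzip` is hypothetical and only for illustration.

```python
LZIP_MAGIC = b"LZIP"  # the first four bytes of lzip-compressed data


def looks_like_lzip(data: bytes) -> bool:
    """Return True if data starts with the lzip magic bytes."""
    return data.startswith(LZIP_MAGIC)


# Output of LZString.compress('abc'), copied from JavaScript
compressed = "ↂテ䀀"

# lz-string packs its bit stream into 16-bit string code units, so the
# UTF-16 view is closer to the real data than the UTF-8 bytes are.
utf8_bytes = compressed.encode("utf-8")
utf16_bytes = compressed.encode("utf-16-be")

print(utf8_bytes.hex(" "))   # e2 86 82 e3 83 86 e4 80 80
print(utf16_bytes.hex(" "))  # 21 82 30 c6 40 00

# Neither encoding carries an lzip header, hence "Header error".
print(looks_like_lzip(utf8_bytes), looks_like_lzip(utf16_bytes))  # False False
```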

Psionman
  • What's with the `compressed = 'ↂテ䀀'`? – spectras Aug 18 '23 at 11:09
  • Question updated – Psionman Aug 18 '23 at 11:11
  • I can only assume that there are some header bytes that can not be copied because they are not part of the ascii visible character set. Try saving the compressed string to a file and open it with Notepad++ for example or a Hex editor to make all the bytes visible. – Andreas Gottardi Aug 18 '23 at 11:18
  • The only examples I see for that package decompress from files, maybe try saving to a file and decompressing that way? – C.Nivs Aug 18 '23 at 11:19
  • According to the docs: `def decompress_buffer(buffer, word_size=1)`: "Decode a single buffer and return the decompressed bytes as an in-memory buffer. `buffer` must be a byte-like object, such as bytes or a bytearray. `word_size`: see 'Word size and remaining bytes'. This function returns a bytes object." – Psionman Aug 18 '23 at 11:37
  • @Psionman After some experimentation, it seems [LZString does not use a portable format](https://pieroxy.net/blog/pages/lz-string/index.html#inline_menu_7). However, there is a Python implementation available that can handle it: [lz-string-python](https://github.com/eduardtomasek/lz-string-python). This package hasn't seen any commits since 2016, but it seems to work okay: `from lzstring import LZString; LZString.decompress('ↂテ䀀')` -> `'abc'`. – ekhumoro Aug 19 '23 at 12:13
  • @ekhumoro Great - do you want to work it up into an answer? – Psionman Aug 24 '23 at 15:44

1 Answer

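The JavaScript LZString library [does not use a portable format](https://pieroxy.net/blog/pages/lz-string/index.html#inline_menu_7), so its output cannot be decompressed by lzip or other general-purpose compression libraries. However, there is a Python implementation available that can handle the output: [lz-string-python](https://github.com/eduardtomasek/lz-string-python).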

This package hasn't seen any commits since 2016, but it still seems to work okay:

>>> from lzstring import LZString
>>> LZString.decompress('ↂテ䀀')
'abc'
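
If depending on an unmaintained package is a concern, the decoding step is small enough to sketch by hand. What follows is a minimal, unofficial port of the decoding loop as I understand it from the lz-string JavaScript source. It is not the package's code, the name `lzstring_decompress` is made up for illustration, and it assumes the plain `LZString.compress` output (not the Base64 or URI-safe variants).

```python
def lzstring_decompress(compressed: str) -> str:
    """Sketch of a decoder for plain LZString.compress output."""
    if not compressed:
        return ""
    # JavaScript strings are sequences of 16-bit code units; recover them.
    raw = compressed.encode("utf-16-be", "surrogatepass")
    units = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    total_bits = len(units) * 16
    pos = 0  # index of the next bit to read

    def read_bits(count: int) -> int:
        """Read `count` bits; the first bit read is the least significant."""
        nonlocal pos
        value = 0
        for i in range(count):
            unit = units[pos // 16] if pos < total_bits else 0
            bit = (unit >> (15 - pos % 16)) & 1  # MSB-first within a unit
            value |= bit << i
            pos += 1
        return value

    def read_char(kind: int) -> str:
        """kind 0 means an 8-bit literal, kind 1 a 16-bit literal."""
        return chr(read_bits(8 if kind == 0 else 16))

    first = read_bits(2)
    if first == 2:  # end-of-stream marker: empty input string
        return ""
    if first not in (0, 1):
        raise ValueError("invalid lz-string data")

    w = read_char(first)
    dictionary = {3: w}  # codes 0, 1 and 2 are reserved markers
    dict_size = 4
    enlarge_in = 4       # entries left before the code width grows
    num_bits = 3
    result = [w]

    while True:
        if pos >= total_bits:
            raise ValueError("truncated lz-string data")
        code = read_bits(num_bits)
        if code == 2:  # end-of-stream marker
            return "".join(result)
        if code in (0, 1):  # a new literal character follows
            dictionary[dict_size] = read_char(code)
            code = dict_size
            dict_size += 1
            enlarge_in -= 1
        if enlarge_in == 0:
            enlarge_in = 1 << num_bits
            num_bits += 1
        if code in dictionary:
            entry = dictionary[code]
        elif code == dict_size:  # LZW case: code not yet in the dictionary
            entry = w + w[0]
        else:
            raise ValueError("invalid lz-string data")
        result.append(entry)
        dictionary[dict_size] = w + entry[0]
        dict_size += 1
        enlarge_in -= 1
        w = entry
        if enlarge_in == 0:
            enlarge_in = 1 << num_bits
            num_bits += 1


print(lzstring_decompress("ↂテ䀀"))  # abc
```

Note that copying the compressed string through the clipboard only works when every code unit survives the trip; for anything beyond a quick test, `LZString.compressToBase64` on the JavaScript side is probably a safer transport.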
ekhumoro