Below are the answers I found, along with example code.
1. Can we generate and use a custom deflate dictionary from a preset of words?
Yes, this can be done. A quick example in Python:
import zlib
#Data for compression
hello = b'hello'
#Compress with dictionary
co = zlib.compressobj(wbits=-zlib.MAX_WBITS, zdict=hello)
compress_data = co.compress(hello) + co.flush()
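To make the "preset of words" part concrete, here is a minimal sketch that builds the dictionary from a word list and compares the compressed size with and without it. The word list, the message, and the helper names `deflate_raw` and `inflate_raw` are made up for illustration. The ordering follows the zlib recommendation to put the most common substrings at the end of the dictionary.

```python
import zlib

# Hypothetical preset of words, ordered from least to most common.
# zlib recommends placing the most frequently used substrings
# at the END of the dictionary.
preset_words = [b"dictionary", b"compression", b"world", b"hello"]
zdict = b" ".join(preset_words)

message = b"hello world, hello compression dictionary"


def deflate_raw(data, zdict=None):
    """Compress data as a raw deflate stream, optionally with a preset dictionary."""
    if zdict is None:
        co = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    else:
        co = zlib.compressobj(wbits=-zlib.MAX_WBITS, zdict=zdict)
    return co.compress(data) + co.flush()


def inflate_raw(blob, zdict):
    """Decompress a raw deflate stream that was produced with the same dictionary."""
    do = zlib.decompressobj(wbits=-zlib.MAX_WBITS, zdict=zdict)
    return do.decompress(blob) + do.flush()


with_dict = deflate_raw(message, zdict)
without_dict = deflate_raw(message)

# Compare compressed sizes with and without the dictionary
print(len(with_dict), len(without_dict))

# Round trip using the same dictionary
restored = inflate_raw(with_dict, zdict)
print(restored == message)
```

The dictionary only helps when the data actually contains substrings from it, so it should be built from data that resembles what you will send.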
2. Can we send a file without the deflate dictionary and use a local one?
Yes, you can send just the compressed data without the dictionary. The compressed data is in compress_data
in the example code above. However, to decompress it, the receiver needs the same zdict
value that was passed during compression. Example of how it is decompressed:
hello = b'hello' #for passing to zdict
do = zlib.decompressobj(wbits=-zlib.MAX_WBITS, zdict=hello)
data = do.decompress(compress_data)
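A related point: a raw deflate stream (negative wbits) carries no record of which dictionary was used. As a sketch, assuming you are free to use the zlib container instead (positive wbits), the stream header then stores a checksum of the dictionary, and decompressing without a dictionary raises zlib.error rather than going unnoticed. The names `shared_dict` and `payload` are made up for illustration.

```python
import zlib

# Dictionary stored locally on both the sender and the receiver (illustrative value)
shared_dict = b"hello"
payload = b"hello hello hello"

# Sender: zlib container (positive wbits) with a preset dictionary
co = zlib.compressobj(wbits=zlib.MAX_WBITS, zdict=shared_dict)
blob = co.compress(payload) + co.flush()

# Receiver: only the blob is transmitted, the dictionary is the local copy
do = zlib.decompressobj(wbits=zlib.MAX_WBITS, zdict=shared_dict)
received = do.decompress(blob)

# Receiver without the dictionary: zlib reports an error
try:
    zlib.decompressobj(wbits=zlib.MAX_WBITS).decompress(blob)
    missing_dict_failed = False
except zlib.error:
    missing_dict_failed = True

print(received, missing_dict_failed)
```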
A full example, with and without a dictionary:
import zlib
#Data for compression
hello = b'hello'
#Compression with dictionary
co = zlib.compressobj(wbits=-zlib.MAX_WBITS, zdict=hello)
compress_data = co.compress(hello) + co.flush()
#Compression without dictionary
co_nodict = zlib.compressobj(wbits=-zlib.MAX_WBITS)
compress_data_nodict = co_nodict.compress(hello) + co_nodict.flush()
#De-compression with dictionary
do = zlib.decompressobj(wbits=-zlib.MAX_WBITS, zdict=hello)
data = do.decompress(compress_data)
#print compressed output when dict used
print(compress_data)
#print compressed output when dict not used
print(compress_data_nodict)
#print decompressed output when dict used
print(data)
The code above does not work on Unicode strings as-is, because zlib operates on bytes. For Unicode data, encode the string first, as below:
import zlib
#Data for compression
unicode_data = 'റെക്കോർഡ്'
hello = unicode_data.encode('utf-16be')
#Compression with dictionary
co = zlib.compressobj(wbits=-zlib.MAX_WBITS, zdict=hello)
compress_data = co.compress(hello) + co.flush()
...
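For the Unicode case, the step that is easy to forget is decoding again after decompression. A minimal round-trip sketch, assuming the same utf-16be encoding on both sides (any encoding works as long as both sides agree on it):

```python
import zlib

text = 'റെക്കോർഡ്'

# zlib works on bytes, so encode the string before using it
encoded = text.encode('utf-16be')

# Compress with the encoded bytes as the dictionary
co = zlib.compressobj(wbits=-zlib.MAX_WBITS, zdict=encoded)
blob = co.compress(encoded) + co.flush()

# Decompress with the same dictionary, then decode back to str
do = zlib.decompressobj(wbits=-zlib.MAX_WBITS, zdict=encoded)
restored_text = do.decompress(blob).decode('utf-16be')

print(restored_text == text)
```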
References for a JS-based approach:
- How to find a good/optimal dictionary for zlib 'setDictionary' when processing a given set of data?
- Compression of data with dictionary using zlib in node.js