I have HDF5 files that contain numpy arrays with shape (N, M, Q), where N is the number of matrices. Their main property is that the values are powers of two (plus zeros) and repeat a lot. Here is an example of what I mean:
[[0,2,4,16,1024], [2,4,16,512,128], [4,16,128,0,2048] ...]
I'm looking for good compression. I tested gzip and bzip2, but they don't seem to be a good choice in this case. It seems I need compression with a custom dictionary, or something else that can compress such datasets really well. I don't have a good grasp of filters and compressors, so I decided to ask here while I read up on them.
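To make the "custom dictionary" idea concrete, here is a minimal sketch of what I have in mind. It is only an illustration under assumptions: the data is randomly generated (so it lacks the repetition of the real matrices and the sizes are not representative), the names `alphabet`, `codes`, and `decode` are made up, and the standard-library `zlib` and `bz2` modules stand in for the HDF5 gzip and bzip2 filters.

```python
import bz2
import random
import zlib
from array import array

random.seed(0)  # reproducible illustrative data, not my real dataset

# Hypothetical stand-in for one flattened (M, Q) matrix: zeros and powers of two.
alphabet = [0] + [2 ** k for k in range(1, 12)]  # 0, 2, 4, ..., 2048
values = [random.choice(alphabet) for _ in range(100_000)]

# Raw layout: 8 bytes per value, similar to an int64 numpy array.
raw = array("q", values).tobytes()

# "Custom dictionary" idea: store each value as a one-byte code.
# 0 -> 0 and 2**k -> k + 1, which is exactly int.bit_length().
codes = bytes(v.bit_length() for v in values)


def decode(code):
    """Invert the mapping: 0 -> 0, k + 1 -> 2**k."""
    return 0 if code == 0 else 1 << (code - 1)


# The mapping is lossless for this kind of data.
decoded = [decode(c) for c in codes]
assert decoded == values

sizes = {
    "raw": len(raw),
    "zlib(raw)": len(zlib.compress(raw, 9)),
    "bz2(raw)": len(bz2.compress(raw, 9)),
    "codes": len(codes),
    "zlib(codes)": len(zlib.compress(codes, 9)),
}
for name, size in sizes.items():
    print(f"{name:12s} {size:8d} bytes")
```

The remapping alone shrinks the data by a factor of eight before any compressor runs, and the general-purpose compressor then works on one-byte symbols. Whether this is the right approach for HDF5 filters is exactly what I'm unsure about.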
If you have any experience with this, or any ideas or recommendations, I'd be very thankful for your help!
Thanks in advance!