Efficiently save 4D array with binary info

Question

I have a 4D array where every value along axis=3 is either a 1 or a 0. I've tried saving this as an array in a .npy file, but for a (252,512,512,6) array this already gave 3GB of data. I am wondering whether it is possible to store this kind of data in a much more efficient way, drastically lowering the file size.

I've already tried using "False" and "True", and I got it down to about 400MB, but I am still looking into whether it is possible to reduce that number further, either via the datatype or the way I am saving it.

*"... where every value at axis=3 is either a 1 or a 0."* Does that mean *all* the values in the array are either 0 or 1? — Warren Weckesser, Feb 20 '20 at 16:05
Does this answer your question? [Compress numpy arrays efficiently](https://stackoverflow.com/questions/22400652/compress-numpy-arrays-efficiently) — AMC, Feb 20 '20 at 16:41

sacuL · Answer 1 · 2020-02-20T16:19:51.163

1

You can use np.savez_compressed, which will significantly compress the array and reduce the filesize:

# create sample array:
>>> import numpy as np
>>> x = np.random.randint(1, 30, size=(252, 512, 512, 6))

>>> np.savez("test.npz", x)
# test.npz is 2.95GB

>>> np.savez_compressed("test2.npz", arr=x)
# test2.npz is 369MB

To reload your array, use:

>>> loaded = np.load("test2.npz")
>>> x2 = loaded["arr"]

You can then check that x2 (your reloaded array) is equal to x (your original array):

>>> np.array_equal(x, x2)
True
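
Since the array in the question only holds 0s and 1s, the dtype matters as much as the compression. The following is a minimal sketch, not part of the original answer, that casts to bool and additionally packs eight values into each byte with np.packbits before calling np.savez_compressed. The small shape, the file name mask.npz, and the key names packed and shape are assumptions chosen for illustration only.

```python
import os
import tempfile

import numpy as np

# Small stand-in for the (252, 512, 512, 6) array; the shape is only chosen to keep the demo fast.
rng = np.random.default_rng(0)
shape = (8, 64, 64, 6)
x = rng.integers(0, 2, size=shape)  # integer array holding only 0s and 1s

# Casting to bool stores one byte per value instead of one full integer per value.
x_bool = x.astype(bool)

# np.packbits stores eight values per byte. It flattens the array by default,
# so the original shape has to be saved alongside the packed bytes.
packed = np.packbits(x_bool)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mask.npz")  # hypothetical file name
    np.savez_compressed(path, packed=packed, shape=np.array(shape))

    with np.load(path) as loaded:
        saved_shape = tuple(int(s) for s in loaded["shape"])
        n_values = int(np.prod(saved_shape))
        # packbits pads the last byte with zeros, so trim back to n_values before reshaping.
        restored = np.unpackbits(loaded["packed"])[:n_values]
        restored = restored.reshape(saved_shape).astype(bool)

round_trip_ok = np.array_equal(x_bool, restored)
print(round_trip_ok)
```

For the full (252, 512, 512, 6) array there are 396,361,728 values, so a bool array takes about 396MB uncompressed (which matches the roughly 400MB reported in the question) and the bit-packed form about 49.5MB before any compression is applied. How much np.savez_compressed shrinks it further depends on how regular the data is.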

edited Feb 20 '20 at 16:19

answered Feb 20 '20 at 16:13


1

Thank you! It's actually a really logical solution. This together with using True and False got it down to .5 MB – drjeffrey Feb 20 '20 at 16:39
