I have a simple array (say length 1000) of objects in zarr. I want to replace it with a slimmed down version, picking only a subset of the items, as specified using a boolean array of size 1000. I want to keep everything else the same (e.g. if this array is a persistent one, I want to change the array on disk as well as in memory).

I can't simply reassign the array:

my_zarr_data = my_zarr_data[:][selected_items]

because that raises `ValueError: missing object_codec for object array`.

Another option would be to make a copy, delete all the data, then add it back from the original using append(), but I can't see how to clear a zarr array while keeping the object_codec and other params the same (perhaps I could just do resize(0)?).

At the moment I'm resizing the array to sum(selected_items) and then calling my_zarr_data.set_basic_selection(..., my_zarr_data[:][selected_items]).

Is that right? Is there a more efficient way to permanently reassign an array to (say) the return value from get_mask_selection()?

NoDataDumpNoContribution
user2667066
  • I suggest raising this as an issue on GitHub; the ValueError suggests a bug. A minimal reproducible example that triggers the ValueError would be very helpful. – Alistair Miles Nov 07 '19 at 12:27
  • Thanks @alistair-miles. Opened at https://github.com/zarr-developers/zarr-python/issues/502. The follow-up question is whether it is possible to (efficiently) shrink a zarr array in place by subsetting (e.g. using a boolean mask), rather than by using `resize()` and then `set_basic_selection()` to replace the entire array in one go. – user2667066 Nov 08 '19 at 12:03
