I'm looking for an efficient way to check if an array is jagged, where "jagged" means that an element of the array has a different shape from one of it's neighbors in the same dimension.
e.g. [[1, 2], [3, 4, 5]]
or [[1, 2], [3, 4], [5, 6], [[7], [8]]]
Where I'm using list syntax for convenience, but the arguments may be nested lists or nested numpy arrays. I'm also showing integers for convenience, by the lowest-level components could be anything (e.g. generic objects). Let's say the lowest-level objects are not iterable themselves (e.g. str
or dict
, but definitely bonus points for a solution that can handle those too!).
Attemp:
Recursively flattening an array is pretty easy, though I'm guessing quite inefficient, and then the length of the flattened array can be compared to the numpy.size
of the input array. If they match, then it is not jagged.
def really1d(arr):
# Returns false if the given array is not 1D or is a jagged 1D array.
if np.ndim(arr) != 1:
return False
if len(arr) == 0:
return True
if np.any(np.vectorize(np.ndim)(arr)):
return False
return True
def flatten(arr):
# Convert the given array to 1D (even if jagged)
if (not np.iterable(arr)) or really1d(arr):
return arr
return np.concatenate([flatten(aa) for aa in arr])
def isjagged(arr):
if (np.size(arr) == len(flatten(arr))):
return False
return True
I'm pretty sure the concatenations are copying all of the data, which is a complete waste. Maybe there is an itertools
or numpy.flatiter
method of achieving the same goal? Ultimately the flattened array is only being used to find it's length.