15

An n-dimensional array has 2n sides (a 1-dimensional array has 2 endpoints; a 2-dimensional array has 4 sides or edges; a 3-dimensional array has 6 2-dimensional faces; a 4-dimensional array has 8 sides; etc.). This is analogous to what happens with abstract n-dimensional cubes.

I want to check whether all sides of an n-dimensional array consist only of zeros. Here are three examples of arrays whose sides consist only of zeros:

# 1D
np.array([0,1,2,3,0])
# 2D
np.array([[0, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 2, 3, 0],
          [0, 0, 1, 0],
          [0, 0, 0, 0]])
# 3D
np.array([[[0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]],
          [[0, 0, 0, 0],
           [0, 1, 2, 0],
           [0, 0, 0, 0]],
          [[0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]])

How can I check if all sides of a multidimensional numpy array are arrays of zeros? For example, with a simple 2-dimensional array I can do this:

x = np.random.rand(5, 5)
assert np.sum(x[0:,  0]) == 0
assert np.sum(x[0,  0:]) == 0
assert np.sum(x[0:, -1]) == 0
assert np.sum(x[-1, 0:]) == 0

While this approach works for 2D cases, it does not generalize to higher dimensions. I wonder if there is some clever numpy trick I can use here to make it efficient and also more maintainable.
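
As a side note on the 2D check above, here is a minimal sketch (the helper name borders_zero_2d is made up for illustration) that builds an array with a zero border and tests the four edges directly. It also shows why comparing np.sum(...) to 0 can be misleading when positive and negative values cancel, which is an assumption about the data that the question does not state either way.

```python
import numpy as np

def borders_zero_2d(x):
    """Return True if all four edges of a 2-D array contain only zeros."""
    edges = (x[0, :], x[-1, :], x[:, 0], x[:, -1])
    return all(not edge.any() for edge in edges)

# Nonzero interior, zero border.
x = np.zeros((5, 5))
x[1:-1, 1:-1] = np.arange(1, 10).reshape(3, 3)
print(borders_zero_2d(x))

# A sum-based check can be fooled: 1 and -1 cancel on the top edge.
y = np.zeros((3, 3))
y[0, 0], y[0, 1] = 1.0, -1.0
print(np.sum(y[0, :]) == 0)   # the sum is zero although the edge is not
print(borders_zero_2d(y))
```

Testing each edge with any() avoids the cancellation problem, but it is still tied to exactly two dimensions.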

Riccardo Bucco
Luca

5 Answers

10

Here's how you can do it:

assert(all(np.all(np.take(x, index, axis=axis) == 0)
           for axis in range(x.ndim)
           for index in (0, -1)))

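np.take(x, index, axis=axis) selects the same elements as "fancy" indexing along the given axis, so each call returns one side of the array (index 0 for the first side, -1 for the last).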
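
For reference, the same expression can be wrapped in a function and run against the three arrays from the question. This is only a sketch; the name sides_are_zero is an assumption made for illustration.

```python
import numpy as np

def sides_are_zero(x):
    """Check the first and last slice along every axis with np.take."""
    x = np.asarray(x)
    # Generator expression, so all() can stop at the first nonzero side.
    return all(
        np.all(np.take(x, index, axis=axis) == 0)
        for axis in range(x.ndim)
        for index in (0, -1)
    )

a1 = np.array([0, 1, 2, 3, 0])
a2 = np.array([[0, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 2, 3, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 0]])
a3 = np.zeros((3, 3, 4), dtype=int)
a3[1, 1, 1:3] = (1, 2)

print(sides_are_zero(a1), sides_are_zero(a2), sides_are_zero(a3))

a2_bad = a2.copy()
a2_bad[0, 2] = 7   # put a nonzero value on the top edge
print(sides_are_zero(a2_bad))
```

Note that for a 0-dimensional input the loop body never runs, so this sketch returns True; whether that is the desired behavior is a design choice.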

Riccardo Bucco
  • This is awesome. I was not sure how I can loop over the axes indices and switch between 0 and -1. Did not know about `np.take` – Luca Mar 23 '20 at 11:03
  • 1
    @Luca: The documentation doesn't make it clear, but `numpy.take` makes a copy. This may cause it to perform worse than code based on a view. (Timing would be necessary to be sure - NumPy view efficiency is sometimes weird.) – user2357112 Mar 23 '20 at 11:05
  • 5
    Also, the use of a list comprehension prevents `all` from short-circuiting. You could remove the brackets to use a generator expression, allowing `all` to return as soon as a single `numpy.all` call returns `False`. – user2357112 Mar 23 '20 at 11:09
  • 1
    @user2357112supportsMonica True!! – Riccardo Bucco Mar 23 '20 at 11:10
5

Here's an answer that actually examines the parts of the array you're interested in, and doesn't waste time constructing a mask the size of the whole array. There's a Python-level loop, but it's short, with iterations proportional to the number of dimensions instead of the array's size.

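import numpy

def all_borders_zero(array):
    if not array.ndim:
        raise ValueError("0-dimensional arrays not supported")
    for dim in range(array.ndim):
        # Move the current axis to the front; moveaxis returns a view, not a copy.
        view = numpy.moveaxis(array, dim, 0)
        # view[0] and view[-1] are the two sides perpendicular to this axis.
        if not (view[0] == 0).all():
            return False
        if not (view[-1] == 0).all():
            return False
    return True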
user2357112
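
To illustrate the view-versus-copy point, here is a small sketch that uses np.shares_memory to compare a side obtained through moveaxis plus basic indexing with the same side obtained through np.take. The variable names are made up for the example.

```python
import numpy as np

array = np.zeros((3, 4, 5))

# Side 0 along axis 1, obtained two ways.
face_view = np.moveaxis(array, 1, 0)[0]   # basic indexing on a view
face_copy = np.take(array, 0, axis=1)     # np.take builds a new array

print(face_view.shape, face_copy.shape)
print(np.shares_memory(array, face_view))  # True: no data was copied
print(np.shares_memory(array, face_copy))  # False: the data was copied
```

Whether avoiding the copy is actually faster depends on the array's size and memory layout, so timing on realistic data is still worthwhile.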
  • Are there any circumstances where `not (view[0] == 0).all()` isn't equivalent to `view[0].any()`? – Paul Panzer Mar 24 '20 at 00:01
  • @PaulPanzer: I suppose `view[0].any()` would work too. I'm not entirely sure of the efficiency implications of the casting and buffering involved in the two options - `view[0].any()` could theoretically be implemented faster, but I've seen weird results before, and I don't fully understand the buffering involved. – user2357112 Mar 24 '20 at 00:05
  • I suppose `view[0].view(bool).any()` would be the high-speed solution. – Paul Panzer Mar 24 '20 at 00:15
  • @PaulPanzer: [`argmax` might actually beat `any` over the boolean view](https://stackoverflow.com/questions/45771554/why-numpy-any-has-no-short-circuit-mechanism/45774536#45774536). This stuff gets weird. – user2357112 Mar 24 '20 at 00:17
  • (Also, whether `argmax` or `any`, using a boolean view means handling negative zero as unequal to regular zero.) – user2357112 Mar 24 '20 at 00:19
  • Actually, wait - `view(bool)` isn't safe, because we can't guarantee that arbitrary nonzero bit patterns are treated as equivalent to the bit pattern normally used for a true boolean. – user2357112 Mar 24 '20 at 00:21
  • It's easy to check and it works as expected. But you are still right in that this probably is an implementation detail, not a spec. – Paul Panzer Mar 24 '20 at 00:28
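
The points raised in these comments can be tried out with a short sketch. It compares the two zero tests on a few numeric samples, including negative zero and NaN, and then looks at the raw bytes of negative zero to show why reinterpreting the memory is risky. The helper names are assumptions, and the samples cover only ordinary numeric dtypes.

```python
import numpy as np

def is_all_zero_eq(v):
    return bool((v == 0).all())

def is_all_zero_any(v):
    return not v.any()

samples = {
    "zeros": np.zeros(4),
    "negative zero": np.array([0.0, -0.0]),
    "nan": np.array([0.0, np.nan]),
    "ints": np.array([0, 3, 0]),
}
for name, v in samples.items():
    # Both tests agree on each of these samples.
    print(name, is_all_zero_eq(v), is_all_zero_any(v))

# Negative zero compares equal to zero, but its byte pattern is not all zero,
# so any test based on reinterpreting the raw memory sees it as nonzero.
neg_zero = np.array([-0.0])
print(neg_zero.view(np.uint8).any())
```

This sketch does not use view(bool) itself, since how arbitrary nonzero bytes behave as booleans is the implementation detail under discussion.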
2

I reshaped the array and then iterated through it. Unfortunately, my answer assumes you have at least three dimensions and will raise an error for ordinary matrices; you would have to add a special case for 1- and 2-dimensional arrays. In addition, this will be slow, so there are likely better solutions.

import numpy as np

x = np.array(
        [
            [
                [0 , 1, 1, 0],
                [0 , 2, 3, 0],
                [0 , 4, 5, 0]
            ],
            [
                [0 , 6, 7, 0],
                [0 , 7, 8, 0],
                [0 , 9, 5, 0]
            ]
        ])

xx = np.array(
        [
            [
                [0 , 0, 0, 0],
                [0 , 2, 3, 0],
                [0 , 0, 0, 0]
            ],
            [
                [0 , 0, 0, 0],
                [0 , 7, 8, 0],
                [0 , 0, 0, 0]
            ]
        ])

def check_edges(x):

    idx = x.shape
    chunk = np.prod(idx[:-2])
    x = x.reshape((chunk*idx[-2], idx[-1]))
    for block in range(chunk):
        z = x[block*idx[-2]:(block+1)*idx[-2], :]
        if not np.all(z[:, 0] == 0):
            return False
        if not np.all(z[:, -1] == 0):
            return False
        if not np.all(z[0, :] == 0):
            return False
        if not np.all(z[-1, :] == 0):
            return False

    return True

Calling check_edges(x) and check_edges(xx) will produce

>>> False
>>> True

Basically, I stack all the leading dimensions on top of each other and then loop over the resulting 2-D blocks to check their edges.
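
A possible variation on the same idea is sketched below: reshaping to (blocks, rows, columns) with -1 avoids the explicit loop and also accepts plain 2-D matrices. The function name is hypothetical. Like the code above, it inspects only the edges of the 2-D blocks formed by the last two axes, so the first and last blocks themselves are not required to be zero.

```python
import numpy as np

def check_edges_last_two_axes(x):
    """Check the edges of every 2-D block formed by the last two axes."""
    x = np.asarray(x)
    if x.ndim < 2:
        raise ValueError("expected an array with at least 2 dimensions")
    # Collapse all leading axes into one: shape (n_blocks, rows, cols).
    blocks = x.reshape(-1, x.shape[-2], x.shape[-1])
    return not (blocks[:, 0, :].any() or blocks[:, -1, :].any()
                or blocks[:, :, 0].any() or blocks[:, :, -1].any())

x = np.array([[[0, 1, 1, 0], [0, 2, 3, 0], [0, 4, 5, 0]],
              [[0, 6, 7, 0], [0, 7, 8, 0], [0, 9, 5, 0]]])
xx = np.array([[[0, 0, 0, 0], [0, 2, 3, 0], [0, 0, 0, 0]],
               [[0, 0, 0, 0], [0, 7, 8, 0], [0, 0, 0, 0]]])

print(check_edges_last_two_axes(x))    # False
print(check_edges_last_two_axes(xx))   # True
print(check_edges_last_two_axes(np.zeros((4, 4))))  # True, 2-D input works
```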

lwileczek
1

You can make use of slice and boolean masking to get the job done:

def get_borders(arr):
    # slices that select the "core" (everything except the outermost layer)
    s = tuple(slice(1, i - 1) for i in arr.shape)
    mask = np.ones(arr.shape, dtype=bool)
    mask[s] = False
    return arr[mask]

This function first builds the tuple s of slices that selects the "core" of the array, and then builds a mask that is True only for the bordering points. Boolean indexing then delivers the border points.

Working example:

a = np.arange(16).reshape((4,4))

print(a)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]

borders = get_borders(a)
print(borders)
[ 0  1  2  3  4  7  8 11 12 13 14 15]

Then, np.all(borders==0) will give you the desired information.


Note: for one-dimensional arrays this returns just the two end points, though I consider those an edge case. You're probably better off just checking the two points in question there.

Lukas Thaler
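
Putting the pieces together, a self-contained sketch of the mask-based check could look like the following. It assumes the slices are built from the argument's own shape, and the wrapper name borders_are_zero is made up for illustration.

```python
import numpy as np

def get_borders(arr):
    # Slices selecting the interior, built from the argument's own shape.
    core = tuple(slice(1, n - 1) for n in arr.shape)
    mask = np.ones(arr.shape, dtype=bool)
    mask[core] = False          # True only on the outermost layer
    return arr[mask]

def borders_are_zero(arr):
    return bool(np.all(get_borders(np.asarray(arr)) == 0))

a = np.arange(16).reshape((4, 4))
print(get_borders(a).size, a.size)   # 12 border points out of 16

ok_2d = np.zeros((5, 4), dtype=int)
ok_2d[1:-1, 1:-1] = 9
print(borders_are_zero(ok_2d))

# In this variant a 1-D input yields just its two end points.
print(get_borders(np.array([0, 1, 2, 3, 0])))
```

The mask has the same shape as the input, so building it costs time and memory proportional to the whole array, not just its border.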
  • This takes time proportional to the total number of elements in the array, instead of just the border. Also, one-dimensional arrays are not an irrelevant edge case. – user2357112 Mar 23 '20 at 10:47
  • 1
    Also, `np.arange(15)` doesn't include 15. – user2357112 Mar 23 '20 at 10:57
  • I agree that "irrelevant" is a strong wording, though I feel you're better off just checking the two concerning points for a 1d array. The 15 is a typo, good catch – Lukas Thaler Mar 23 '20 at 10:59
1

Maybe the ellipsis operator (...) is what you are looking for; it lets the same indexing expressions work with any number of leading dimensions:

import numpy as np

# data
x = np.random.rand(2, 5, 5)
x[..., 0:, 0] = 0
x[..., 0, 0:] = 0
x[..., 0:, -1] = 0
x[..., -1, 0:] = 0

test = np.all(
    [
        np.all(x[..., 0:, 0] == 0),
        np.all(x[..., 0, 0:] == 0),
        np.all(x[..., 0:, -1] == 0),
        np.all(x[..., -1, 0:] == 0),
    ]
)

print(test)
daveg
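
One thing worth checking with this approach is which sides it actually covers. The sketch below (both function names are assumptions for illustration) compares the ellipsis-based test, which looks at the borders along the last two axes, with a test over the first and last slice of every axis. The two disagree on an array whose first and last 2-D blocks contain nonzero values.

```python
import numpy as np

def last_two_axes_borders_zero(x):
    """Ellipsis-based check: borders along the last two axes only."""
    return bool(
        np.all(x[..., :, 0] == 0) and np.all(x[..., 0, :] == 0)
        and np.all(x[..., :, -1] == 0) and np.all(x[..., -1, :] == 0)
    )

def all_sides_zero(x):
    """Check the first and last slice along every axis."""
    return all(
        not np.moveaxis(x, axis, 0)[index].any()
        for axis in range(x.ndim)
        for index in (0, -1)
    )

x = np.zeros((2, 5, 5))
x[:, 1:-1, 1:-1] = 1.0   # nonzero interior in every 2-D block

print(last_two_axes_borders_zero(x))  # True
print(all_sides_zero(x))              # False: x[0] and x[-1] are sides too
```

If the leading axes are meant as a batch dimension, the ellipsis version may be exactly what is wanted; for the question's definition of sides, the leading axes need to be checked as well.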