How can I remove the first n columns/lines with 0 values in a 2D matrix?

Question

Referring to this previous question:

import numpy as np

data = np.array([[4, 1, 1, 2, 0, 4],
                 [3, 4, 3, 1, 4, 4],
                 [1, 4, 3, 1, 0, 0],
                 [0, 4, 4, 0, 4, 3],
                 [0, 0, 0, 0, 0, 0]])

data = data[~(data==0).all(1)]
print(data)

Output:

    [[4 1 1 2 0 4]
     [3 4 3 1 4 4]
     [1 4 3 1 0 0]
     [0 4 4 0 4 3]]

OK, so far so good, but what if I add a null (all-zero) column?

data = np.array([[0, 4, 1, 1, 2, 0, 4],
                 [0, 3, 4, 3, 1, 4, 4],
                 [0, 1, 4, 3, 1, 0, 0],
                 [0, 0, 4, 4, 0, 4, 3],
                 [0, 0, 0, 0, 0, 0, 0]])

The output is:

    [[0 4 1 1 2 0 4]
     [0 3 4 3 1 4 4]
     [0 1 4 3 1 0 0]
     [0 0 4 4 0 4 3]]

which is not what I want: the all-zero row is gone, but the all-zero column is still there.

Basically, if my matrix is:

    [[0, 0, 0, 0, 0, 0, 0, 0, 0],
     [0, 0, 4, 1, 1, 2, 0, 4, 0],
     [0, 0, 3, 4, 3, 1, 4, 4, 0],
     [0, 0, 1, 4, 3, 1, 0, 0, 0],
     [0, 0, 0, 4, 4, 0, 4, 3, 0],
     [0, 0, 0, 0, 0, 0, 0, 0, 0]]

the output I'm expecting is:

    [[4 1 1 2 0 4]
     [3 4 3 1 4 4]
     [1 4 3 1 0 0]
     [0 4 4 0 4 3]]

What if you have null row or cols in between, like through the middle? — Divakar, Jun 21 '18 at 10:47
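The distinction raised in this comment matters: dropping every all-zero row and column is not the same as trimming only the outer border. As a minimal sketch of the "drop everything" variant (the small array below is made up for illustration and is not from the question), boolean masks can be combined with np.ix_:

```python
import numpy as np

# Hypothetical data with an all-zero row and an all-zero column in the middle
data = np.array([[0, 4, 1, 0, 2],
                 [0, 0, 0, 0, 0],
                 [0, 3, 4, 0, 1],
                 [0, 0, 0, 0, 0]])

# True for rows / columns that contain at least one non-zero value
keep_rows = (data != 0).any(axis=1)
keep_cols = (data != 0).any(axis=0)

# np.ix_ builds an open mesh so both masks are applied at once
dropped = data[np.ix_(keep_rows, keep_cols)]
print(dropped)
```

Here the zero row and zero column in the middle are removed as well, so this only matches the expected output in the question when the interior contains no all-zero rows or columns.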

Divakar · Accepted Answer · 2018-06-21T11:12:12.680

Here's one approach -

def reduced_box(a):
    # Store shape info
    M, N = a.shape

    # Mask of valid (non-zero) places in the array
    mask = a != 0

    # Boolean array: True for each row holding at least one valid value
    m_col = mask.any(1)

    # Get the first valid row and one past the last valid row with argmax,
    # which returns the index of the first True in a boolean array.
    # More info : https://stackoverflow.com/a/47269413/
    r0, r1 = m_col.argmax(), M - m_col[::-1].argmax()

    # Repeat for cols
    m_row = mask.any(0)
    c0, c1 = m_row.argmax(), N - m_row[::-1].argmax()

    # Finally slice with the valid indices as the bounding box limits
    return a[r0:r1, c0:c1]

Sample run -

In [210]: a
Out[210]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 4, 1, 0, 2, 0, 4, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 4, 0, 1, 0, 0, 0],
       [0, 0, 0, 4, 0, 0, 4, 3, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0]])

In [211]: reduced_box(a)
Out[211]: 
array([[4, 1, 0, 2, 0, 4],
       [0, 0, 0, 0, 0, 0],
       [1, 4, 0, 1, 0, 0],
       [0, 4, 0, 0, 4, 3]])
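To unpack the argmax step, here is a minimal walkthrough on the same sample array. The variable names (row_has_data, first_row, stop_row and so on) are chosen for illustration and do not appear in the function above; the logic is meant to mirror it step by step.

```python
import numpy as np

a = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 4, 1, 0, 2, 0, 4, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 1, 4, 0, 1, 0, 0, 0],
              [0, 0, 0, 4, 0, 0, 4, 3, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0]])

# True for each row / column containing at least one non-zero value
row_has_data = (a != 0).any(axis=1)   # [F, T, F, T, T, F]
col_has_data = (a != 0).any(axis=0)   # [F, F, T, T, F, T, T, T, F]

# argmax on a boolean array gives the index of the first True
first_row = row_has_data.argmax()                              # 1
# On the reversed array it counts the trailing False entries, so
# subtracting from the length gives an exclusive stop index
stop_row = len(row_has_data) - row_has_data[::-1].argmax()     # 6 - 1 = 5

first_col = col_has_data.argmax()                              # 2
stop_col = len(col_has_data) - col_has_data[::-1].argmax()     # 9 - 1 = 8

# Only the outer border is trimmed; the zero row and column inside stay
cropped = a[first_row:stop_row, first_col:stop_col]
print(cropped)
```

One edge case worth knowing: for an all-zero input, argmax returns 0 in both directions, so the slice spans the whole array and the input comes back unchanged.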

Works and accepted. Can you kindly explain the code, though? — Nelly, Jun 21 '18 at 11:08

score 1 · Answer 2 · answered Jun 21 '18 at 11:09

You can use scipy.ndimage.find_objects (the older scipy.ndimage.measurements namespace is deprecated in recent SciPy releases):

import numpy as np
from scipy.ndimage import find_objects

data = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
                 [0, 0, 4, 1, 1, 2, 0, 4, 0],
                 [0, 0, 3, 4, 3, 1, 4, 4, 0],
                 [0, 0, 1, 4, 3, 1, 0, 0, 0],
                 [0, 0, 0, 4, 4, 0, 4, 3, 0],
                 [0, 0, 0, 0, 0, 0, 0, 0, 0]])
data[find_objects(data.astype(bool))[0]]
#array([[4, 1, 1, 2, 0, 4],
#       [3, 4, 3, 1, 4, 4],
#       [1, 4, 3, 1, 0, 0],
#       [0, 4, 4, 0, 4, 3]])
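If SciPy is not available, a similar bounding box can be computed with NumPy alone. The sketch below is a different technique from find_objects, not a reimplementation of it: it uses np.argwhere to collect the coordinates of the non-zero entries, and the helper name crop_to_nonzero is made up for this example.

```python
import numpy as np

def crop_to_nonzero(a):
    """Trim all-zero rows/columns from the border of a 2D array (illustrative helper)."""
    # One (row, col) pair per non-zero entry
    coords = np.argwhere(a != 0)
    if coords.size == 0:
        # Assumption: an all-zero input yields an empty array
        return a[:0, :0]
    # Smallest and largest occupied index along each axis
    r0, c0 = coords.min(axis=0)
    r1, c1 = coords.max(axis=0) + 1   # +1 makes the stop index exclusive
    return a[r0:r1, c0:c1]

data = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
                 [0, 0, 4, 1, 1, 2, 0, 4, 0],
                 [0, 0, 3, 4, 3, 1, 4, 4, 0],
                 [0, 0, 1, 4, 3, 1, 0, 0, 0],
                 [0, 0, 0, 4, 4, 0, 4, 3, 0],
                 [0, 0, 0, 0, 0, 0, 0, 0, 0]])
result = crop_to_nonzero(data)
print(result)
```

Returning an empty array for all-zero input is a design choice made here for the sketch; returning the input unchanged would be equally defensible.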

I hadn't seen this function before, neat. — miradulo, Jun 21 '18 at 12:11
