numpy python raising non zero values to top of 2d array quickly and efficiently

Question

Sorry about the title; I'd welcome suggestions for a better description. I want a function, as fast as possible, that takes the non-zero entries in each column of an array and fills a new array with them moved to the top, preserving their order. The example below probably makes this clearer:

Input Array

import numpy as np

np.random.seed(2)
a = np.random.randint(0,10,10)
b = np.random.randint(0,10,10)
c = np.random.randint(0,10,10)
# Zero out the odd values so that each row contains some "empty" entries
a = 0 * (a % 2) + (1-(a % 2))*a
b = 0 * (b % 2) + (1-(b % 2))*b
c = 0 * (c % 2) + (1-(c % 2))*c
arr = np.array([a,b,c])

arr
>>> array([[8, 8, 6, 2, 8, 0, 2, 0, 0, 4],
           [4, 0, 0, 0, 6, 4, 0, 0, 6, 0],
           [0, 0, 8, 4, 6, 0, 0, 2, 0, 4]])

Output Array

outArr = np.empty_like(arr)
outArr[0,:] = (arr[0,:] > 0) * arr[0,:] + ~(arr[0,:] > 0) * (arr[1,:] > 0) * arr[1,:] + ~(arr[0,:] > 0) * ~(arr[1,:] > 0) * arr[2,:]
outArr[1,:] = (arr[0,:] > 0) * arr[1,:] + (arr[0,:] > 0) * ~(arr[1,:] > 0) * arr[2,:]
outArr[2,:] = (arr[0,:] > 0) * (arr[1,:] > 0) * arr[2,:]

outArr
>>> array([[8, 8, 6, 2, 8, 4, 2, 2, 6, 4],
           [4, 0, 8, 4, 6, 0, 0, 0, 0, 4],
           [0, 0, 0, 0, 6, 0, 0, 0, 0, 0]])

I have hard-coded this for 3 rows only so that I could write the function by hand; in reality there could be more rows (on the order of tens, nothing too crazy).

EDIT:

The dimensions I would actually like to use are about 5 rows by 100-150k columns.

The data type will always be integer.

Finally, the update process is: I add a new row at the bottom, justify upwards, and then remove all trailing rows that contain only 0s (null values).
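To make the update step concrete, here is a slow reference sketch of what one update does. The name update_reference and the per-column loop are only illustrative assumptions, not the fast version being asked for:

```python
import numpy as np

def update_reference(state, new_row):
    # Hypothetical helper: append the new row at the bottom
    stacked = np.vstack([state, new_row])
    out = np.zeros_like(stacked)
    # Justify upwards one column at a time, keeping the order of the non-zeros
    for k in range(stacked.shape[1]):
        col = stacked[:, k]
        nonzero = col[col != 0]
        out[:nonzero.size, k] = nonzero
    # After justification, all-zero rows can only sit at the bottom,
    # so counting the rows with any non-zero gives the number of rows to keep
    n_rows = int((out != 0).any(axis=1).sum())
    return out[:n_rows]

state = np.array([[8, 0, 6],
                  [4, 0, 0]])
updated = update_reference(state, np.array([0, 2, 0]))
print(updated.tolist())  # [[8, 2, 6], [4, 0, 0]]
```

Here the new 2 rises to the top of its column, which leaves the bottom row all zeros, so that row is dropped.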

Can you add the actual input information into the question? Considering the particular shape, there could be better ways. — Divakar, May 14 '19 at 18:12
@Divakar added info, please let me know if that's sufficient — qwertylpc, May 14 '19 at 20:41

Divakar · Accepted Answer · 2019-05-14T19:34:00.023

Approach #1

Inspired by justify, here's a version fine-tuned for up-justification. Since sorting can slow things down, it also offers an alternative that builds the mask with broadcasting (use_sort=False) -

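import numpy as np

def justify_up(a, invalid_val=0, use_sort=True):
    # Mask of valid entries; NaN needs isnan because NaN != NaN
    if invalid_val is np.nan:
        mask = ~np.isnan(a)
    else:
        mask = a != invalid_val

    # Mask of where the valid entries land once pushed to the top of each column
    if use_sort:
        justified_mask = np.sort(mask, axis=0)[::-1]
    else:
        justified_mask = mask.sum(0) > np.arange(a.shape[0])[:, None]

    # Output array pre-filled with the invalid value
    if invalid_val == 0:
        out = np.zeros_like(a)
    elif invalid_val == 1:
        out = np.ones_like(a)
    else:
        out = np.full(a.shape, invalid_val)

    # Indexing the transposed views walks the data column by column,
    # so each column's valid values keep their top-to-bottom order
    out.T[justified_mask.T] = a.T[mask.T]
    return out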

Sample run -

In [199]: arr
Out[199]: 
array([[8, 8, 6, 2, 8, 0, 2, 0, 0, 4],
       [4, 0, 0, 0, 6, 4, 0, 0, 6, 0],
       [0, 0, 8, 4, 6, 0, 0, 2, 0, 4]])

In [200]: justify_up(arr, invalid_val=0)
Out[200]: 
array([[8, 8, 6, 2, 8, 4, 2, 2, 6, 4],
       [4, 0, 8, 4, 6, 0, 0, 0, 0, 4],
       [0, 0, 0, 0, 6, 0, 0, 0, 0, 0]])
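To see how the broadcasted mask (use_sort=False) works, here is a small sketch on the sample array. The intermediate names counts and row_ids are introduced only for illustration:

```python
import numpy as np

arr = np.array([[8, 8, 6, 2, 8, 0, 2, 0, 0, 4],
                [4, 0, 0, 0, 6, 4, 0, 0, 6, 0],
                [0, 0, 8, 4, 6, 0, 0, 2, 0, 4]])

mask = arr != 0
counts = mask.sum(0)                          # number of non-zeros in each column
row_ids = np.arange(arr.shape[0])[:, None]    # column vector [[0], [1], [2]]

# Row i of a column is filled in the output exactly when i < count for that column
justified_mask = counts > row_ids

print(counts.tolist())  # [2, 1, 2, 2, 3, 1, 1, 1, 1, 2]

# Both mask constructions agree
assert np.array_equal(justified_mask, np.sort(mask, axis=0)[::-1])
```

The comparison broadcasts a (10,) vector against a (3, 1) column, producing the (3, 10) mask without sorting anything.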

Approach #2

We can also offload the looping to numba and edit the array in place for better performance -

from numba import njit

@njit
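def justify_up_numba(a, invalid_val=0):
    # invalid_val : Any number but NaN
    # Edits `a` in place: bubble-sort style passes that sink invalid values
    m,n = a.shape
    for j in range(m-1):
        for i in range(0,m-j-1):
            for k in range(n):
                # Swap an invalid entry with the one below it in the same column
                if a[i,k]==invalid_val:
                    a[i,k] = a[i+1,k]
                    a[i+1,k] = invalid_val
    return a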

Timings on large array -

In [361]: np.random.seed(0)
     ...: arr = np.random.randint(0,5,(10,100000))

In [362]: %timeit justify_up(arr, invalid_val=0, use_sort=False)
100 loops, best of 3: 10.9 ms per loop

In [363]: %timeit justify_up(arr, invalid_val=0, use_sort=True)
100 loops, best of 3: 15.9 ms per loop

In [364]: %timeit justify_up_numba(arr, invalid_val=0)
100 loops, best of 3: 2.38 ms per loop
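Note that justify_up_numba modifies its argument, so pass a copy if the original array is still needed. The sketch below checks the in-place version against a simple per-column reference. The fallback decorator and the reference helper are assumptions added only so the example runs without numba installed:

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python when numba is unavailable
    def njit(func):
        return func

@njit
def justify_up_numba(a, invalid_val=0):
    m, n = a.shape
    for j in range(m - 1):
        for i in range(m - j - 1):
            for k in range(n):
                if a[i, k] == invalid_val:
                    a[i, k] = a[i + 1, k]
                    a[i + 1, k] = invalid_val
    return a

def justify_up_reference(a):
    # Hypothetical slow reference: move each column's non-zeros to the top
    out = np.zeros_like(a)
    for k in range(a.shape[1]):
        nonzero = a[a[:, k] != 0, k]
        out[:nonzero.size, k] = nonzero
    return out

np.random.seed(0)
arr = np.random.randint(0, 5, (10, 200))
original = arr.copy()

result = justify_up_numba(arr.copy())      # the copy protects arr
assert np.array_equal(arr, original)
assert np.array_equal(result, justify_up_reference(original))

justify_up_numba(arr)                      # without a copy, arr itself changes
assert np.array_equal(arr, result)
```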

Darn, you crushed those runtimes. I will have to read more about numba. Thanks a lot! — qwertylpc, May 16 '19 at 16:49
