[Solutions] Identify the first & last non-zero elements/indices within a group in numpy

========= [Previous question]. Please ignore the following =========

I have numpy arrays like the following:

    import numpy as np

    group = np.array([1,1,1,1,1,1,1,2,2,2,2,2,2,2])
    arr1 = np.array([0,0,0,np.nan,2,np.nan,np.nan,0,np.nan,2,np.nan,np.nan,0,0])
    arr2 = np.array([0,0,0,np.nan,np.nan,3,np.nan,0,np.nan,2,np.nan,np.nan,0,0])

    target_arr = np.array([0,0,0,0,2,2,2,0,0,2,2,2,0,0])

For group 1, the first element that is neither zero nor NaN is 2 in arr1 at index 4. For group 2, it is 2 in both arr1 and arr2 at index 2 within the group (index 9 overall). How do I identify the first-appearing value that is neither zero nor NaN for each group across multiple arrays (i.e. only one value per group), and forward-fill it to create a single array (like target_arr above) without iteration?

I found a similar answer that uses pandas: "Identify first non-zero element within a group in pandas". How do I do it in pure NumPy?

wuya
  • Do you include NaN in "non-zero"? Or would you treat it as a zero? –  Nov 16 '21 at 19:24
  • @user17242583 Thanks for your quick reply. Treat it as a zero please. Or you may ffill arr1 to get an array like np.array([0,0,0,0,2,2,2,0,0,2,2,2,0,0]) – wuya Nov 16 '21 at 19:31

1 Answer

There are a few essential steps:

  • combine the inputs into a single array, taking arr1 for group 1 and arr2 for group 2 (use advanced indexing):

    arr = np.array([arr1, arr2])
    arr = arr[group-1, np.arange(len(group))]
    >>> arr
    array([ 0.,  0.,  0., nan,  2., nan, nan,  0., nan,  2., nan, nan,  0., 0.])
    
  • find indices of nan values (use np.flatnonzero):

    idx = np.flatnonzero(np.isnan(arr))  # indices of NaN values: [3, 5, 6, 8, 10, 11]
    
  • find, for each position, the index of the most recent non-NaN item (use np.maximum.accumulate):

    prev = np.arange(len(arr))
    prev[idx] = 0
    prev = np.maximum.accumulate(prev)
    >>> prev
    array([ 0,  1,  2,  2,  4,  4,  4,  7,  7,  9,  9,  9, 12, 13], dtype=int32)
    
  • forward-fill by indexing arr with prev:

    >>> arr[prev]
    array([0., 0., 0., 0., 2., 2., 2., 0., 0., 2., 2., 2., 0., 0.])
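
The steps above can be collected into one runnable sketch. The helper name ffill_nan and the variable names picked and filled are hypothetical, chosen here for illustration; the logic is the same arange / maximum.accumulate trick, using a boolean NaN mask instead of np.flatnonzero.

```python
import numpy as np

def ffill_nan(arr):
    """Forward-fill NaN entries of a 1-D array with the last non-NaN value.

    Hypothetical helper wrapping the steps from the answer.
    """
    arr = np.asarray(arr, dtype=float)
    # Start with every position pointing at itself.
    prev = np.arange(len(arr))
    # NaN positions are set to 0 so the running maximum ignores them.
    prev[np.isnan(arr)] = 0
    # The running maximum is the index of the most recent non-NaN item.
    prev = np.maximum.accumulate(prev)
    return arr[prev]

group = np.array([1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2])
arr1 = np.array([0, 0, 0, np.nan, 2, np.nan, np.nan, 0, np.nan, 2, np.nan, np.nan, 0, 0])
arr2 = np.array([0, 0, 0, np.nan, np.nan, 3, np.nan, 0, np.nan, 2, np.nan, np.nan, 0, 0])

# Advanced indexing: row group-1 at each column, i.e. arr1 for group 1, arr2 for group 2.
picked = np.array([arr1, arr2])[group - 1, np.arange(len(group))]
filled = ffill_nan(picked)
print(filled)
```

Two limitations of this sketch are worth keeping in mind: a NaN at index 0 has nothing before it and stays NaN, and the fill does not reset at group boundaries, so a NaN at the start of one group would take the last value of the previous group.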
    
mathfux
  • Thanks for your answer. I rephrased my question a little bit. Sorry for the confusion. – wuya Nov 22 '21 at 16:34