This is an extension of the question "Roll rows of a matrix independently" (quoted below):

I have a matrix (2d numpy ndarray, to be precise):

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])

And I want to roll each row of A independently, according to roll values in another array:

r = np.array([2, 0, -1])

That is, I want to do this:

print(np.array([np.roll(row, x) for row, x in zip(A, r)]))

[[0 0 4]
 [1 2 3]
 [0 5 0]]

Is there a way to do this efficiently? Perhaps using fancy indexing tricks?

The accepted solution was:

rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]

# Always use a non-negative shift, so that column_indices are valid.
# (could also use the modulo operation)
r[r < 0] += A.shape[1]
column_indices = column_indices - r[:,np.newaxis]

result = A[rows, column_indices]
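
The modulo alternative mentioned in the comment can be sketched as follows. This is only an illustration, not part of the accepted answer, and the function name `roll_rows_wrap` is hypothetical. Unlike the in-place `r[r < 0] += A.shape[1]`, it leaves `r` unmodified.

```python
import numpy as np

def roll_rows_wrap(A, r):
    """Periodic per-row roll via modulo indexing (hypothetical helper name)."""
    A = np.asarray(A)
    r = np.asarray(r)
    rows, cols = np.ogrid[:A.shape[0], :A.shape[1]]
    # Modulo maps every source column into [0, ncols), so r is never mutated.
    src = (cols - r[:, np.newaxis]) % A.shape[1]
    return A[rows, src]

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
wrapped = roll_rows_wrap(A, r)
```

Here `wrapped` should match the row-by-row `np.roll` loop from the question.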

I would basically like to do the same thing, except that when a value gets rolled "past" the end of the row, I would like the vacated side of the row to be padded with NaN, rather than having the value wrap around to the "front" of the row periodically.

Maybe using np.pad somehow? But I can't figure out how to get that to pad different rows by different amounts.

hm8
  • It _might_ be more efficient to do this in two steps so you don't need to pad: first roll the rows as in the previous question, then set the r leftmost (and -r rightmost) values of each row to NaN. – abarnert Jul 05 '18 at 22:06
  • @abarnert Would this be using the values in 'r' before doing the negative check? (`r[r < 0] += A.shape[1]`) EDIT: Also tricky how to figure out how to do this without looping through r – hm8 Jul 05 '18 at 22:18
  • I would create a `nan` filled array, and then use indexing like this to copy rolled values to it. But your `I want to do` matrix doesn't show this `nan` fill! – hpaulj Jul 05 '18 at 22:30
  • This would be after the entire roll operation you show above. First roll, then… basically what @hpaulj said to overwrite the values that rolled around with nans. And actually, the only way I can think of doing the second step (without looping) is to do it twice, one using just the positive elements of r to copy from the nan array to the left side, then using just the negative elements to copy to the right side, but I don't think that'll be an efficiency issue. But it is getting pretty far from simple and elegant, and hopefully one of the numpy wizards will come along with an obvious one-liner… – abarnert Jul 05 '18 at 22:40
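
The idea floated in the comments (start from a NaN-filled result and copy over only the values that stay inside the row) can be sketched roughly like this. It is one possible reading of those comments, not code from any commenter; the name `roll_rows_nan` is made up for illustration, and the input is assumed to be 2-D.

```python
import numpy as np

def roll_rows_nan(a, r):
    """Shift each row of a 2-D array by r[i], filling vacated slots with NaN.

    Hypothetical helper; assumes `a` is 2-D and `r` has one integer per row.
    """
    a = np.asarray(a, dtype=float)      # NaN needs a float dtype
    r = np.asarray(r)
    n_rows, n_cols = a.shape
    rows = np.arange(n_rows)[:, None]
    # Source column for every output position: out[i, j] = a[i, j - r[i]]
    src = np.arange(n_cols) - r[:, None]
    valid = (src >= 0) & (src < n_cols)
    # Clip only to keep the gather in bounds; invalid slots are overwritten by NaN.
    gathered = a[rows, np.clip(src, 0, n_cols - 1)]
    return np.where(valid, gathered, np.nan)

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
out = roll_rows_nan(A, r)
```

Because validity is decided from the unwrapped source index, positive and negative shifts are handled in one pass, and shifts larger than the row length simply yield an all-NaN row.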

3 Answers

Inspired by the solution to Roll rows of a matrix independently, here's a vectorized one based on np.lib.stride_tricks.as_strided (through scikit-image's view_as_windows) -

import numpy as np
from skimage.util.shape import view_as_windows as viewW

def strided_indexing_roll(a, r):
    # Pad both sides with NaN columns so that every roll has a full window
    p = np.full((a.shape[0],a.shape[1]-1),np.nan)
    a_ext = np.concatenate((p,a,p),axis=1)

    # Get sliding windows; use advanced-indexing to select appropriate ones
    n = a.shape[1]
    return viewW(a_ext,(1,n))[np.arange(len(r)), -r + (n-1),0]

Sample run -

In [76]: a
Out[76]: 
array([[4, 0, 0],
       [1, 2, 3],
       [0, 0, 5]])

In [77]: r
Out[77]: array([ 2,  0, -1])

In [78]: strided_indexing_roll(a, r)
Out[78]: 
array([[nan, nan,  4.],
       [ 1.,  2.,  3.],
       [ 0.,  5., nan]])
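
If scikit-image is not available, the same windowing idea can probably be expressed with NumPy's own `sliding_window_view` (available in NumPy 1.20 and later). The sketch below is an adaptation, not the code from this answer; the name `windowed_roll` is hypothetical, and, like the original, it assumes every shift satisfies `abs(shift) < a.shape[1]`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def windowed_roll(a, r):
    """NaN-padded per-row shift using sliding windows (hypothetical helper).

    Assumes a is 2-D and abs(r[i]) < a.shape[1] for every row.
    """
    a = np.asarray(a, dtype=float)
    r = np.asarray(r)
    n = a.shape[1]
    pad = np.full((a.shape[0], n - 1), np.nan)
    a_ext = np.concatenate((pad, a, pad), axis=1)
    # windows has shape (rows, 2*n - 1, n); window k starts at column k of a_ext
    windows = sliding_window_view(a_ext, n, axis=1)
    # A shift of r[i] corresponds to the window starting at (n - 1) - r[i]
    return windows[np.arange(len(r)), (n - 1) - r]

a = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
rolled = windowed_roll(a, r)
```

The advanced index returns a copy, so the result does not alias the padded buffer.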
cs95
Divakar

I was able to hack this together with linear indexing. It gets the right result but performs rather slowly on large arrays.

import numpy as np

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]]).astype(float)

r = np.array([2, 0, -1])

rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]

# Always use a non-negative shift, so that column_indices are valid.
# (could also use the modulo operation)
r_old = r.copy()
r[r < 0] += A.shape[1]
column_indices = column_indices - r[:,np.newaxis]

result = A[rows, column_indices]

# replace with NaNs
row_length = result.shape[-1]

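pad_inds = []
for ind, i in enumerate(r_old):  # built-in enumerate; NumPy has no np.enumerate
    if i > 0:
        # positive shift: the first i entries of the row wrapped around
        inds2pad = [np.ravel_multi_index((ind,) + (j,),result.shape) for j in range(i)]
        pad_inds.extend(inds2pad)
    if i < 0:
        # negative shift: the last -i entries of the row wrapped around
        inds2pad = [np.ravel_multi_index((ind,) + (j,),result.shape) for j in range(row_length+i,row_length)]
        pad_inds.extend(inds2pad)
result.ravel()[pad_inds] = np.nan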

Gives the expected result:

print(result)

[[ nan  nan   4.]
 [  1.   2.   3.]
 [  0.   5.  nan]]
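
The Python loop that collects `pad_inds` is likely what makes this slow. One way to avoid it, sketched here under the assumption that the array is 2-D, is to build the NaN positions as a single broadcast boolean mask. The helper name `wrapped_mask` is hypothetical and not part of the answer above.

```python
import numpy as np

def wrapped_mask(r, n_cols):
    """Boolean mask of positions that received wrapped-around values.

    Hypothetical helper: r holds the original (signed) shifts, one per row.
    """
    r = np.asarray(r)[:, None]
    cols = np.arange(n_cols)
    left = (r > 0) & (cols < r)              # first r entries for positive shifts
    right = (r < 0) & (cols >= n_cols + r)   # last -r entries for negative shifts
    return left | right

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]]).astype(float)
r = np.array([2, 0, -1])

# Periodic roll first (modulo indexing), then blank out the wrapped positions.
rows, cols = np.ogrid[:A.shape[0], :A.shape[1]]
result = A[rows, (cols - r[:, None]) % A.shape[1]]
mask = wrapped_mask(r, A.shape[1])
result[mask] = np.nan
```

The mask is computed from the signed shifts, so no copy such as `r_old` is needed.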
hm8

Based on the answers by @Seberg and @yann-dubois for the non-NaN case, I've written a method that:

  • Is faster than the current answer
  • Works on ndarrays of any shape (specify the row-axis using the axis argument)
  • Allows for setting fill to either np.nan, any other "fill value" or False to allow regular rolling across the array edge.

Benchmarking

cols, rows = 1024, 2048
arr = np.stack(rows*(np.arange(cols,dtype=float),))
# Keep |shift| < cols: strided_indexing_roll has no window for a shift of -cols
shifts = np.random.randint(-cols + 1, cols, rows)

np.testing.assert_array_almost_equal(row_roll(arr, shifts), strided_indexing_roll(arr, shifts))
# Passes silently (an AssertionError would be raised on a mismatch)

%timeit row_roll(arr, shifts)
# 25.9 ms ± 161 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit strided_indexing_roll(arr, shifts)
# 29.7 ms ± 446 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Implementation

import numpy as np

def row_roll(arr, shifts, axis=1, fill=np.nan):
    """Apply an independent roll to each row along a single axis.

    Parameters
    ----------
    arr : np.ndarray
        Array of any shape.

    shifts : np.ndarray, dtype int. Shape: `(arr.shape[:axis],)`.
        Amount to roll each row by. Positive values shift the row right.

    axis : int
        Axis along which elements are shifted.

    fill : float or False
        Value written into the positions vacated by the shift (np.nan by
        default). Pass False to roll across the edges instead.
    """
    # np.integer matches every integer dtype regardless of platform int size
    if np.issubdtype(arr.dtype, np.integer) and isinstance(fill, float):
        arr = arr.astype(float)

    shifts2 = shifts.copy()
    arr = np.swapaxes(arr,axis,-1)
    # list(...) keeps the indices mutable whether ogrid returns a list or a tuple
    all_idcs = list(np.ogrid[tuple(slice(0, n) for n in arr.shape)])
    # Convert to a positive shift
    shifts2[shifts2 < 0] += arr.shape[-1] 
    all_idcs[-1] = all_idcs[-1] - shifts2[:, np.newaxis]

    result = arr[tuple(all_idcs)]

    if fill is not False:
        # Create mask of row positions above negative shifts
        # or below positive shifts. Then set them to np.nan.
        *_, nrows, ncols  = arr.shape

        mask_neg = shifts < 0
        mask_pos = shifts >= 0
        
        shifts_pos = shifts.copy()
        shifts_pos[mask_neg] = 0
        shifts_neg = shifts.copy()
        shifts_neg[mask_pos] = ncols+1 # need to be bigger than the biggest positive shift
        shifts_neg[mask_neg] = shifts[mask_neg] % ncols

        indices = np.stack(nrows*(np.arange(ncols),))
        nanmask = (indices < shifts_pos[:, None]) | (indices >= shifts_neg[:, None])
        result[nanmask] = fill

    arr = np.swapaxes(result,-1,axis)

    return arr
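
When comparing implementations like the ones on this page, a slow but obviously correct reference can be handy. The sketch below is an illustrative loop-based checker for the 2-D case only; it is not part of the answer above, and the name `row_roll_reference` is hypothetical.

```python
import numpy as np

def row_roll_reference(arr, shifts, fill=np.nan):
    """Per-row loop for cross-checking faster versions (hypothetical helper).

    Assumes arr is 2-D and shifts has one integer per row.
    """
    arr = np.asarray(arr, dtype=float)
    out = np.full(arr.shape, fill, dtype=float)
    ncols = arr.shape[1]
    for i, s in enumerate(shifts):
        if abs(s) >= ncols:
            continue                      # everything is shifted out of the row
        if s >= 0:
            out[i, s:] = arr[i, :ncols - s]
        else:
            out[i, :s] = arr[i, -s:]
    return out

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
reference = row_roll_reference(A, r)
```

A vectorized function can then be checked with `np.testing.assert_array_equal(fast_result, reference)`, which treats NaNs in matching positions as equal.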
TomNorway