
I have a 1D array that I want to transform into a 2D array, where row i is the original 1D array rolled by i steps. I implemented it like this:

import numpy as np
data=np.arange(0,10,2)
rolling=np.arange(len(data))

array=np.array([np.roll(data,-i) for i in rolling])

array
array([[0, 2, 4, 6, 8],
       [2, 4, 6, 8, 0],
       [4, 6, 8, 0, 2],
       [6, 8, 0, 2, 4],
       [8, 0, 2, 4, 6]])

For later purposes, I would like the rolling to work such that the array is not rolled over the edge, and the values that would wrap around are replaced by something else, for example np.nan.

My intended output is

array([[0, 2, 4, 6, 8],
       [2, 4, 6, 8, np.nan],
       [4, 6, 8, np.nan, np.nan],
       [6, 8, np.nan, np.nan, np.nan],
       [8, np.nan, np.nan, np.nan, np.nan]])

The data is not necessarily as uniform as in this example, so the edge cannot be detected from the values themselves, as it could be here. I experimented with padding, but it is neither short nor convenient, since every row would need a different padding. I also considered np.tril and np.triu, but those only worked relative to the main diagonal, and the rolling edge is not along the main diagonal. In this example it lies on the anti-diagonal, but that might shift in the real use case, which would look like

array=np.array([np.roll(data,-i+manualshift) for i in rolling])

EDIT: Additional example

If I use a larger array and introduce an additional shift like this

data=np.arange(0,20,2)
rolling=np.arange(len(data))
manualshift=3
array=np.array([np.roll(data,-i+manualshift) for i in rolling])

then the desired array would look like this:

array([[nan, nan, nan,   0,   2,   4,   6,   8,  10,  12],
       [nan, nan,   0,   2,   4,   6,   8,  10,  12,  14],
       [nan,   0,   2,   4,   6,   8,  10,  12,  14,  16],
       [  0,   2,   4,   6,   8,  10,  12,  14,  16,  18],
       [  2,   4,   6,   8,  10,  12,  14,  16,  18, nan],
       [  4,   6,   8,  10,  12,  14,  16,  18, nan, nan],
       [  6,   8,  10,  12,  14,  16,  18, nan, nan, nan],
       [  8,  10,  12,  14,  16,  18, nan, nan, nan, nan],
       [ 10,  12,  14,  16,  18, nan, nan, nan, nan, nan],
       [ 12,  14,  16,  18, nan, nan, nan, nan, nan, nan]])

EDIT END


Is there a short solution for this?

Lepakk

1 Answer


Approach #1

SciPy has a built-in Hankel matrix constructor for this, which might meet your "short" solution requirement -

In [43]: from scipy.linalg import hankel

In [59]: hankel(data,np.full(len(data),np.nan))
Out[59]: 
array([[ 0.,  2.,  4.,  6.,  8.],
       [ 2.,  4.,  6.,  8., nan],
       [ 4.,  6.,  8., nan, nan],
       [ 6.,  8., nan, nan, nan],
       [ 8., nan, nan, nan, nan]])
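
For context, `hankel(c, r)` builds a matrix whose first column is `c` and whose last row is `r` (the first element of `r` is ignored in favour of `c[-1]`), so entry `[i, j]` holds element `i + j` of the combined sequence. A minimal sketch of how the same call could also cover a non-zero `manualshift` from the question, assuming `0 <= manualshift < len(data)`; the helper name `shifted_hankel` and its `fill` parameter are hypothetical:

```python
import numpy as np
from scipy.linalg import hankel

def shifted_hankel(data, manualshift=0, fill=np.nan):
    """Hypothetical helper: rows are successive shifts of data, padded with fill."""
    n = len(data)
    # Pad manualshift values in front and enough behind for n windows of length n.
    b = np.concatenate([np.full(manualshift, fill),
                        np.asarray(data, dtype=float),
                        np.full(n - manualshift - 1, fill)])
    # First column is b[:n]; last row is b[n-1:], which also has length n.
    return hankel(b[:n], b[n - 1:])

data = np.arange(0, 10, 2)
result = shifted_hankel(data, manualshift=1)
print(result)
```

With `manualshift=0` this reduces to the call shown above.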

Approach #2

Another one based on NumPy strides, using view_as_windows from scikit-image -

In [49]: from skimage.util import view_as_windows

In [50]: b = np.r_[data,np.full(len(data)-1,np.nan)]

In [51]: view_as_windows(b,len(data))
Out[51]: 
array([[ 0.,  2.,  4.,  6.,  8.],
       [ 2.,  4.,  6.,  8., nan],
       [ 4.,  6.,  8., nan, nan],
       [ 6.,  8., nan, nan, nan],
       [ 8., nan, nan, nan, nan]])

More info on the use of the as_strided-based view_as_windows.

NumPy native way for getting sliding windows.
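
For the NumPy-native route, NumPy 1.20 and later provide `np.lib.stride_tricks.sliding_window_view`, which can stand in for `view_as_windows` here. A minimal sketch, assuming that NumPy version is available:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data = np.arange(0, 10, 2)
n = len(data)

# Append n-1 NaNs so that every window of length n fits; np.r_ upcasts to float.
b = np.r_[data, np.full(n - 1, np.nan)]

# Read-only view of shape (n, n); row i is b[i:i+n].
windows = sliding_window_view(b, n)
print(windows)
```

The result is a read-only view onto `b`, so call `.copy()` on it before modifying any entries.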

Approach #3

Another short way, reusing b from the previous step -

In [56]: b[np.add.outer(*[np.arange(len(data))]*2)]
Out[56]: 
array([[ 0.,  2.,  4.,  6.,  8.],
       [ 2.,  4.,  6.,  8., nan],
       [ 4.,  6.,  8., nan, nan],
       [ 6.,  8., nan, nan, nan],
       [ 8., nan, nan, nan, nan]])
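
The index expression is compact but dense: `np.add.outer` over two copies of `np.arange(len(data))` yields a grid whose entry `[i, j]` is `i + j`, and indexing `b` with that grid picks `b[i + j]`. A sketch of the same idea spelled out with broadcasting (the names `idx` and `out` are only illustrative):

```python
import numpy as np

data = np.arange(0, 10, 2)
n = len(data)
b = np.r_[data, np.full(n - 1, np.nan)]

# idx[i, j] == i + j, built by broadcasting a column against a row.
idx = np.arange(n)[:, None] + np.arange(n)

# Fancy indexing returns a new array (a copy, not a view) with out[i, j] == b[i + j].
out = b[idx]
print(out)
```

Unlike the strided view in Approach #2, this produces an independent array that can be modified freely.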

Approach #4

Pandas way -

In [65]: import pandas as pd

In [66]: pd.DataFrame([data[i:] for i in range(len(data))]).values
Out[66]: 
array([[ 0.,  2.,  4.,  6.,  8.],
       [ 2.,  4.,  6.,  8., nan],
       [ 4.,  6.,  8., nan, nan],
       [ 6.,  8., nan, nan, nan],
       [ 8., nan, nan, nan, nan]])

Approach #5

With itertools -

In [93]: from itertools import zip_longest

In [94]: d = [data[i:] for i in range(len(data))]

In [95]: np.array(list(zip_longest(*d, fillvalue=np.nan)))
Out[95]: 
array([[ 0.,  2.,  4.,  6.,  8.],
       [ 2.,  4.,  6.,  8., nan],
       [ 4.,  6.,  8., nan, nan],
       [ 6.,  8., nan, nan, nan],
       [ 8., nan, nan, nan, nan]])

Incorporating manualshift

To incorporate manualshift as custom padding on the leading side, we can extend Approaches #2 and #3. For that, b needs to be changed to something like the following (this assumes 0 <= manualshift < len(data)), while keeping the rest the same -

b = np.r_[np.full(manualshift,np.nan),data,np.full(len(data)-manualshift-1,np.nan)]

For a shorter alternative, please look into np.pad for padding with NaNs.
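
As a rough illustration of the `np.pad` route, the sketch below wraps the padding and the index grid from Approach #3 in a hypothetical helper, `shifted_rows` (the name and the `fill` parameter are assumptions, not part of the original answer). It assumes `0 <= manualshift < len(data)` and converts to float first, since an integer array cannot hold NaN:

```python
import numpy as np

def shifted_rows(data, manualshift=0, fill=np.nan):
    """Hypothetical helper: row i holds data shifted by manualshift - i, padded with fill."""
    data = np.asarray(data, dtype=float)  # integer arrays cannot store NaN
    n = len(data)
    if not 0 <= manualshift < n:
        raise ValueError("manualshift must satisfy 0 <= manualshift < len(data)")
    # Pad manualshift values in front and n - manualshift - 1 behind (total length 2n - 1).
    b = np.pad(data, (manualshift, n - manualshift - 1), constant_values=fill)
    # idx[i, j] == i + j, so the result satisfies out[i, j] == b[i + j].
    idx = np.arange(n)[:, None] + np.arange(n)
    return b[idx]

data = np.arange(0, 20, 2)
result = shifted_rows(data, manualshift=3)
print(result)
```

For the larger example in the question, row 3 of `result` is the unshifted data, with NaNs filling the upper left and lower right corners around it.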

Divakar
  • Thank you for this great variety of options; it's great to learn about all these ways. I tried to understand them, but I still think they don't meet all requirements. Maybe I didn't state it clearly enough in the question: if I have an additional manual shift, the array structure becomes a bit more complicated. The upper left corner then gets some of the `nan`s, followed by the "anti-diagonals" from the original array, and then the rest of the `nan`s. Is there also a short solution for this, or does it get more lengthy? – Lepakk Jan 10 '20 at 20:33
  • I added another example in the original post, that hopefully helps to make the problem clear. – Lepakk Jan 10 '20 at 20:42
  • Thank you. Now that I understand what `np.r_` is doing: I replaced that line with `b = np.concatenate(...)`, and it also works. Is there any difference between these approaches, or are they completely the same? – Lepakk Jan 11 '20 at 22:05
  • @Lepakk `np.r_` is just an easier way to concatenate with some minimal functional overhead, I think. – Divakar Jan 12 '20 at 05:50
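
To make the comparison in the last two comments concrete, a small check (a sketch, not a benchmark) that the two spellings build the same padded array for the example data:

```python
import numpy as np

data = np.arange(0, 10, 2)
pad = np.full(len(data) - 1, np.nan)

# Both calls join the 1D pieces along axis 0 and upcast the integers to float.
b_r = np.r_[data, pad]
b_cat = np.concatenate([data, pad])

# equal_nan=True treats NaNs in matching positions as equal.
same = np.allclose(b_r, b_cat, equal_nan=True)
print(same, b_r.dtype, b_cat.dtype)
```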