I am stuck on a problem where I have data as shown below:

Data

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

Now I want to generate the matrix shown in the figure below, that is, a matrix with 10 columns where each row holds the 10 previous values:

[figure: expected matrix; image not included]

I could not find any optimal or built-in method to convert a series to a matrix like this. Can anybody help me achieve this?

bharatk
Paras Ghai

4 Answers

I believe you need strides, padding with 0 instead of NaN, together with the DataFrame constructor:

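import numpy as np
import pandas as pd

df = pd.DataFrame({'data': range(1, 21)})

n = 10
# left-pad the series with n zeros so the first windows are zero-filled
x = np.concatenate([[0] * n, df['data'].values])

def rolling_window(a, window):
    # view of every length-`window` slice of `a` along the last axis (no copy)
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

arr = rolling_window(x, n)
# reverse the first n (zero-padded) rows so the zeros end up on the right
arr = np.concatenate([arr[:n, ::-1], arr[n:]])

arr = pd.DataFrame(arr)
print(arr)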
     0   1   2   3   4   5   6   7   8   9
0    0   0   0   0   0   0   0   0   0   0
1    1   0   0   0   0   0   0   0   0   0
2    2   1   0   0   0   0   0   0   0   0
3    3   2   1   0   0   0   0   0   0   0
4    4   3   2   1   0   0   0   0   0   0
5    5   4   3   2   1   0   0   0   0   0
6    6   5   4   3   2   1   0   0   0   0
7    7   6   5   4   3   2   1   0   0   0
8    8   7   6   5   4   3   2   1   0   0
9    9   8   7   6   5   4   3   2   1   0
10   1   2   3   4   5   6   7   8   9  10
11   2   3   4   5   6   7   8   9  10  11
12   3   4   5   6   7   8   9  10  11  12
13   4   5   6   7   8   9  10  11  12  13
14   5   6   7   8   9  10  11  12  13  14
15   6   7   8   9  10  11  12  13  14  15
16   7   8   9  10  11  12  13  14  15  16
17   8   9  10  11  12  13  14  15  16  17
18   9  10  11  12  13  14  15  16  17  18
19  10  11  12  13  14  15  16  17  18  19
20  11  12  13  14  15  16  17  18  19  20
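
On NumPy 1.20 or later, the same windows can likely be obtained without computing strides by hand, using `np.lib.stride_tricks.sliding_window_view`. This is only a minimal sketch under that version assumption. The helper name `padded_windows` is hypothetical and not part of the answer above, and it does not reverse the zero-padded rows.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def padded_windows(values, window):
    """Hypothetical helper: left-pad with zeros, then take every length-`window` slice."""
    values = np.asarray(values)
    padded = np.concatenate([np.zeros(window, dtype=values.dtype), values])
    # sliding_window_view returns a read-only view; copy it before modifying
    return sliding_window_view(padded, window)

windows = padded_windows(np.arange(1, 21), 10)
print(windows.shape)          # (21, 10)
print(windows[10].tolist())   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(windows[-1].tolist())   # [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
```

Because the result is a view rather than a copy, it should be cheap even for long series, but call `.copy()` on it before writing into it.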
jezrael

A much lower-tech approach than @jezrael's is to combine two list comprehensions, whose result can easily be transformed into a DataFrame if you need one:

a = list(range(1, 21))  # assumed input: the values 1..20 from the question

first = [[j if j <= row else 0 for j in range(1, 11)] for row in a[:9]]
second = [[j for j in range(row - 9, row + 1)] for row in a[9:]]
first + second

Output:

[[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [1, 2, 0, 0, 0, 0, 0, 0, 0, 0],
 [1, 2, 3, 0, 0, 0, 0, 0, 0, 0],
 [1, 2, 3, 4, 0, 0, 0, 0, 0, 0],
 [1, 2, 3, 4, 5, 0, 0, 0, 0, 0],
 [1, 2, 3, 4, 5, 6, 0, 0, 0, 0],
 [1, 2, 3, 4, 5, 6, 7, 0, 0, 0],
 [1, 2, 3, 4, 5, 6, 7, 8, 0, 0],
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
 [3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
 [4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
 [5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
 [6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
 [7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
 [8, 9, 10, 11, 12, 13, 14, 15, 16, 17],
 [9, 10, 11, 12, 13, 14, 15, 16, 17, 18],
 [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]]
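
The conversion to a DataFrame mentioned above could look roughly like this. It is a sketch that assumes the input `a` is simply the values 1..20 from the question, since the comprehensions build rows from the numbers themselves rather than by indexing into the data.

```python
import pandas as pd

a = list(range(1, 21))  # assumed input: the values 1..20 from the question

first = [[j if j <= row else 0 for j in range(1, 11)] for row in a[:9]]
second = [list(range(row - 9, row + 1)) for row in a[9:]]

# Each inner list becomes one row; columns are labelled 0..9 by default
df = pd.DataFrame(first + second)
print(df.shape)  # (20, 10)
```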
Aryerez
  • @coder I saw it, but it looks like a typo, since it doesn't make sense to be all zeros if the second line has 1 and 2. – Aryerez Nov 12 '19 at 09:51

Another solution using numpy:

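import numpy as np

x = np.ones([10,10]) * np.arange(1,11)             # every row is 1..10
y = np.tril(x,0)                                   # keep the lower triangle, zeros elsewhere
y[0,0]=0                                           # make the first row all zeros
z = np.ones(10) * np.arange(1,11).reshape(10,1)    # row i is filled with i+1
res = np.concatenate((y,x+z),axis=0)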

Output:

>>> res
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
       [ 2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.],
       [ 3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.],
       [ 4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.],
       [ 5.,  6.,  7.,  8.,  9., 10., 11., 12., 13., 14.],
       [ 6.,  7.,  8.,  9., 10., 11., 12., 13., 14., 15.],
       [ 7.,  8.,  9., 10., 11., 12., 13., 14., 15., 16.],
       [ 8.,  9., 10., 11., 12., 13., 14., 15., 16., 17.],
       [ 9., 10., 11., 12., 13., 14., 15., 16., 17., 18.],
       [10., 11., 12., 13., 14., 15., 16., 17., 18., 19.],
       [11., 12., 13., 14., 15., 16., 17., 18., 19., 20.]])
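
Note that `res` holds floats and is built from the numbers 1..20 themselves rather than from the data. If the series contains other values, one possible adaptation is to treat `res` as a matrix of 1-based positions (with 0 meaning padding) and use it to index into the data. The sketch below assumes this; the `data` array is a made-up example, not from the question.

```python
import numpy as np

# Position matrix as built in the answer above (1-based positions, 0 = padding)
x = np.ones([10, 10]) * np.arange(1, 11)
y = np.tril(x, 0)
y[0, 0] = 0
z = np.ones(10) * np.arange(1, 11).reshape(10, 1)
res = np.concatenate((y, x + z), axis=0)

# Hypothetical input series of length 20 (not from the question)
data = np.arange(100, 120)

# Prepend a 0 so that position 0 maps to the padding value
lookup = np.concatenate([[0], data])
values = lookup[res.astype(int)]

print(values[1].tolist())    # [100, 101, 0, 0, 0, 0, 0, 0, 0, 0]
print(values[-1].tolist())   # [110, 111, 112, 113, 114, 115, 116, 117, 118, 119]
```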
coder
import numpy as np

zeros = np.zeros((20,10))
numbers = np.arange(1,21)
for index, value in enumerate(numbers):
    if index < 10:
        arr = np.pad(numbers[:index],(0, 10-index), mode='constant')
    else:
        arr = numbers[index-10:index]
    zeros[index] = arr
print(zeros)

This gives:

zeros...
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  0.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  0.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
       [ 2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.],
       [ 3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.],
       [ 4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.],
       [ 5.,  6.,  7.,  8.,  9., 10., 11., 12., 13., 14.],
       [ 6.,  7.,  8.,  9., 10., 11., 12., 13., 14., 15.],
       [ 7.,  8.,  9., 10., 11., 12., 13., 14., 15., 16.],
       [ 8.,  9., 10., 11., 12., 13., 14., 15., 16., 17.],
       [ 9., 10., 11., 12., 13., 14., 15., 16., 17., 18.],
       [10., 11., 12., 13., 14., 15., 16., 17., 18., 19.]])

It is slightly different from your array, but I think this is what you wanted?
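
The same loop can be wrapped so that it works for series other than 1..20 and keeps the input's integer dtype. This is a sketch only; the function name `previous_values_matrix` and the sample input are hypothetical.

```python
import numpy as np

def previous_values_matrix(values, window=10):
    """Hypothetical helper: row i holds up to `window` values preceding values[i]."""
    values = np.asarray(values)
    out = np.zeros((len(values), window), dtype=values.dtype)
    for i in range(len(values)):
        prev = values[max(0, i - window):i]   # at most `window` earlier values
        out[i, :len(prev)] = prev             # left-aligned; the rest stays 0
    return out

m = previous_values_matrix([5, 3, 8, 1], window=3)
print(m.tolist())  # [[0, 0, 0], [5, 0, 0], [5, 3, 0], [5, 3, 8]]
```

As in the loop above, each row excludes the current value, so the last row for the data 1..20 with `window=10` is 10..19.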