I have a dataset that contains a two-dimensional NumPy array of shape (2400, 2).

I want to take each of these 2400 rows and expand it into the full range between its start and end points (the two elements in each row). The range is always the same length (in my case, a length of 60; the example below uses 5 for brevity).

For example, if I have something like this:

array([[ 78,  82],
       [ 90,  94],
       [102, 106]])

My output should be something like this:

array([[ 78,  79,  80,  81,  82],
       [ 90,  91,  92,  93,  94],
       [102, 103, 104, 105, 106]])

The only way I have been able to do this is with a for loop, but I am trying to avoid looping through each row as the dataset can get very large.

Thanks!

Pierpressure

2 Answers

If the difference between the second column and first column is always 4, then you can extract the first column and add an array of [0,1,2,3,4] to it:

import numpy as np

arr = np.array([[ 78,  82],
                [ 90,  94],
                [102, 106]])

arr[:,:1] + np.arange(5)
Out[331]:
array([[ 78,  79,  80,  81,  82],
       [ 90,  91,  92,  93,  94],
       [102, 103, 104, 105, 106]])
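
The approach above relies on every row spanning the same width. As a minimal sketch, assuming that requirement should be checked rather than taken for granted, the helper below (the name `expand_rows` is hypothetical, not from the original answer) validates the widths before broadcasting:

```python
import numpy as np

def expand_rows(bounds):
    """Expand each [start, end] row into the inclusive integer range start..end."""
    bounds = np.asarray(bounds)
    starts = bounds[:, 0]
    widths = bounds[:, 1] - starts
    # Broadcasting only works if every row spans the same width.
    if not np.all(widths == widths[0]):
        raise ValueError("all rows must span the same width")
    # (n, 1) column of starts + (width + 1,) offsets -> (n, width + 1)
    return starts[:, None] + np.arange(widths[0] + 1)

arr = np.array([[78, 82],
                [90, 94],
                [102, 106]])
result = expand_rows(arr)
```

For the array in the question, `expand_rows(arr)` gives the same 3 x 5 result as `arr[:, :1] + np.arange(5)`, and it raises `ValueError` if the rows do not all have the same width.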
Psidom

Since the result has to be a rectangular array, all of the ranges must be equally long. We can therefore build a single arange from the width of the first row and broadcast it against the start value of every row.

For example:

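import numpy as np

x = np.array([[78, 82],
              [90, 94],
              [102, 106]])

# The offsets 0..(end - start) are taken from the first row and added to each start
x[:, :1] + np.arange(0, 1 + x[0, 1] - x[0, 0])
# array([[ 78,  79,  80,  81,  82],
#        [ 90,  91,  92,  93,  94],
#        [102, 103, 104, 105, 106]])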
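
If the number of points per row is known up front (60 in the question, 5 in this example), a different technique from the arange broadcast is `np.linspace`, which accepts array-valued start and stop in NumPy 1.16 and later. This is only a sketch, assuming integer endpoints whose spacing yields whole-number steps; the variable names are illustrative:

```python
import numpy as np

x = np.array([[78, 82],
              [90, 94],
              [102, 106]])

num = 5  # points per row; assumed to be known in advance
# axis=1 places the samples along the second axis, giving shape (n_rows, num)
ranges = np.linspace(x[:, 0], x[:, 1], num=num, axis=1)
# linspace returns floats; round before casting back to integers
ranges = np.rint(ranges).astype(int)
```

Unlike the broadcast version, this uses each row's own end point, so it does not depend on the first row being representative.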
Jonas Adler