I have a time series dataset as a numpy array of shape:
(batch_size, observations, sensor_number)
So for example:
(3,10,2)
3 batches of two sensors, each having time series data of length 10.
I now want to change the length of the time series in that array and also specify an overlap factor between consecutive samples.
Here is an example that changes the original dataset from above:
The new length of each sample should be 5, and I want the samples to overlap by 0.4 (40%). For simplicity, the time series values run from 1 to 10.
The original dataset of shape (3,10,2) looks like:
array([[[ 1, 2],[ 3, 4],[ 5, 6],[ 7, 8],[ 9, 10],
[ 1, 2],[ 3, 4],[ 5, 6],[ 7, 8],[ 9, 10]],
[[ 1, 2],[ 3, 4],[ 5, 6],[ 7, 8],[ 9, 10],
[ 1, 2],[ 3, 4],[ 5, 6],[ 7, 8],[ 9, 10]],
[[ 1, 2],[ 3, 4],[ 5, 6],[ 7, 8],[ 9, 10],
[ 1, 2],[ 3, 4],[ 5, 6],[ 7, 8],[ 9, 10]]])
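For reproducibility, the example array above can be built in a few lines. This is only a sketch; the names `pattern`, `series` and `data` are illustrative and not part of the original dataset.

```python
import numpy as np

# One block of sensor readings: rows [1, 2], [3, 4], ..., [9, 10] -> shape (5, 2)
pattern = np.arange(1, 11).reshape(5, 2)

# Repeat the block twice along the time axis -> shape (10, 2)
series = np.tile(pattern, (2, 1))

# Copy the same series into 3 batches -> shape (3, 10, 2)
data = np.tile(series, (3, 1, 1))

print(data.shape)  # (3, 10, 2)
```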
I would expect the new, reshaped numpy array to have the shape (6,5,2). Each chunk will be windowed as described below:
Overlapping: for the new target length of 5, a 40% overlap means that 2 elements from the previous sample carry over into the next sample, so each window starts 3 elements after the previous one.
Keeping only the full-length windows, the case above doubles the number of samples: each original time series of length 10 is sliced into two shorter series of length 5 (time steps 1 to 5 and 4 to 8) that overlap by 2 elements.
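The arithmetic behind these numbers can be written out as a small helper. This is a sketch under the assumption that the overlap is rounded to a whole number of elements and that incomplete trailing windows are dropped; `window_layout` and its parameter names are hypothetical.

```python
def window_layout(series_len, window, overlap):
    """Return (overlapping elements, step between window starts, number of full windows)."""
    # Assumption: the overlap fraction is rounded to a whole number of elements.
    overlap_elems = int(round(window * overlap))
    step = window - overlap_elems
    # Only windows that fit completely inside the series are counted.
    n_windows = (series_len - window) // step + 1
    return overlap_elems, step, n_windows


overlap_elems, step, n_windows = window_layout(series_len=10, window=5, overlap=0.4)
print(overlap_elems, step, n_windows)  # 2 3 2

# 3 batches times 2 windows each gives the expected new shape.
print((3 * n_windows, 5, 2))  # (6, 5, 2)
```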
I tried to reshape it by iterating through all elements in a for loop, but that takes a lot of time, so I think there must be a more performant way, e.g. by vectorizing the operation.
Can anyone please help and give hints on how to do that? Thanks in advance.
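One possible way to avoid the Python loop is to build an index array of window positions and let NumPy's advanced indexing gather all windows at once. The sketch below assumes full-length windows only and a rounded overlap; `overlapping_windows`, `starts` and `idx` are illustrative names, not an established API.

```python
import numpy as np


def overlapping_windows(data, window, overlap):
    """Slice (batch, length, sensors) into overlapping windows of shape (n, window, sensors)."""
    batch, length, sensors = data.shape
    # Assumption: overlap fraction is rounded to whole elements, e.g. 0.4 * 5 -> 2.
    step = window - int(round(window * overlap))
    # Start position of every window that fits completely: here [0, 3].
    starts = np.arange(0, length - window + 1, step)
    # idx[i, j] is the time index of element j in window i -> shape (n_windows, window).
    idx = starts[:, None] + np.arange(window)[None, :]
    # Advanced indexing on the time axis -> shape (batch, n_windows, window, sensors).
    windows = data[:, idx]
    # Merge batch and window axes -> shape (batch * n_windows, window, sensors).
    return windows.reshape(-1, window, sensors)


# Example data as in the question: shape (3, 10, 2).
data = np.tile(np.arange(1, 11).reshape(5, 2), (3, 2, 1))
out = overlapping_windows(data, window=5, overlap=0.4)
print(out.shape)  # (6, 5, 2)
```

With the example data, `out[0]` holds time steps 1 to 5 of the first batch and `out[1]` holds time steps 4 to 8. On NumPy 1.20 or newer, `np.lib.stride_tricks.sliding_window_view` may be an alternative worth trying, since it returns views instead of copies.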
#1: 1...10, #2: 5...15, #3: 10...20. So I increase the original batch size of 3 (in (3,20,2)) to 6, and the new shape of my dataset will be (6,10,2). – deniz Oct 31 '20 at 21:08