I am working on a deep reinforcement learning problem, and the state I need to feed to my agent is stored in a flat vector of numeric values.

The list looks like this:

[7.0, 1.0, 1.0, 0.0, 1.0, 5.0, 0.0, 1.0, 0.0, 1.0, 
 7.0, 1.0, 1.0, 0.0, 1.0, 6.0, 1.0, 0.0, 1.0, 0.0]

However, each complete state for my problem consists of 5 consecutive values, so a new state begins at every 5th element. The complete states in the sample data are:

[[7. 1. 1. 0. 1.]]
[[5. 0. 1. 0. 1.]]
[[7. 1. 1. 0. 1.]]
[[6. 1. 0. 1. 0.]]

I have tried writing a parser function, similar to a sliding window, which should capture a block of 5 values every 5th iteration:

def getState(data, timestep, window):
    parser_start = timestep - window + 1
    block = data[parser_start:timestep + 5] if parser_start >= 0 else data[0:timestep + 5] # pad with t0
    res = []
    for i in range(window - 1):
        res.append(block[i])

    return np.array([res])

I then call it in a for loop like this:

window_size = 5
for t in range(10):
    next_state = getState(data, t + 4, window_size + 1)
    print(next_state)

However, when running the loop the result I get is:

[[7. 1. 1. 0. 1.]]
[[1. 1. 0. 1. 5.]]
[[1. 0. 1. 5. 0.]]
[[0. 1. 5. 0. 1.]]
[[1. 5. 0. 1. 0.]]
[[5. 0. 1. 0. 1.]]
[[0. 1. 0. 1. 7.]]
[[1. 0. 1. 7. 1.]]
[[0. 1. 7. 1. 1.]]
[[1. 7. 1. 1. 0.]]

The window appears to advance by 1 element per iteration rather than by 5. I have been trying for weeks, but I can't find where the problem is.

Do you guys have any fresh ideas?

1 Answer

The window size should be used as the step of the range, so that each slice starts 5 elements after the previous one:

[data[i:i+5] for i in range(0,len(data),5)]

[[7.0, 1.0, 1.0, 0.0, 1.0],
 [5.0, 0.0, 1.0, 0.0, 1.0],
 [7.0, 1.0, 1.0, 0.0, 1.0],
 [6.0, 1.0, 0.0, 1.0, 0.0]]
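
The same stepping idea can also be written as an explicit loop that produces one array per iteration. This is only a minimal sketch: it assumes `data` is the flat list from the question and that each state should be a NumPy array of shape (1, 5), matching the question's printed output. The names `window_size` and `states` are illustrative.

```python
import numpy as np

data = [7.0, 1.0, 1.0, 0.0, 1.0, 5.0, 0.0, 1.0, 0.0, 1.0,
        7.0, 1.0, 1.0, 0.0, 1.0, 6.0, 1.0, 0.0, 1.0, 0.0]

window_size = 5  # number of values per complete state
states = []      # kept only so the result can be inspected afterwards

# Step the start index by window_size so consecutive slices do not overlap.
for start in range(0, len(data), window_size):
    state = np.array([data[start:start + window_size]])  # shape (1, 5)
    states.append(state)
    print(state)
```

Each pass through the loop sees a different block (`data[0:5]`, then `data[5:10]`, and so on), so the slices never overlap.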
galaxyan
  • If this is what the OP is asking for then this is a clear duplicate, please vote to close instead of answering. – juanpa.arrivillaga Sep 10 '19 at 22:40
  • Thanks for the comment, but unfortunately I can't plug this into a sliding-window loop, or at least I can't see how (I am fairly new to Python programming). I am looking for something that updates i (in your example) at each iteration. So, if the slice is data[0:5] in the first iteration, I need it to be data[5:10] in the second. – Antonio Marchi Sep 10 '19 at 23:14
  • In other words, I need 4 separate arrays, one per iteration, rather than one big array containing all the vectors. – Antonio Marchi Sep 10 '19 at 23:20
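
If the state has to be fetched one step at a time inside a training loop, as the comments above describe, one possible sketch is a function that maps a step counter to a non-overlapping block. `get_state`, `step`, and `n_steps` are hypothetical names rather than code from the question or the answer, and the sketch assumes the length of `data` is a multiple of the window size.

```python
import numpy as np

def get_state(data, step, window=5):
    """Return the step-th non-overlapping block of `window` values, shape (1, window)."""
    start = step * window                  # step 0 -> index 0, step 1 -> index 5, ...
    block = data[start:start + window]
    if len(block) < window:
        raise IndexError(f"step {step} is past the end of the data")
    return np.array([block])

data = [7.0, 1.0, 1.0, 0.0, 1.0, 5.0, 0.0, 1.0, 0.0, 1.0,
        7.0, 1.0, 1.0, 0.0, 1.0, 6.0, 1.0, 0.0, 1.0, 0.0]

n_steps = len(data) // 5                   # 4 complete states in the sample data
for t in range(n_steps):
    next_state = get_state(data, t)        # a separate array on every iteration
    print(next_state)
```

The difference from the original `getState` is that the start index is multiplied by the window size instead of growing by 1 with each call.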