I am trying to implement a simple case of deep Q-learning in R, using the neuralnet package.
I start with a network whose weights are initialized randomly. I use it to generate some experience for my agent, which gives me a set of states and their corresponding targets. I then fit a network that maps the states to the targets, which gives me a new network with new weights.
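To make the setup concrete, here is a minimal sketch of the loop as I currently understand it. The toy environment (a single state feature `s`, two actions, random actions instead of an exploration policy), the discount `gamma`, the network size and the `threshold` are all made up for illustration. It assumes a neuralnet version that provides `predict()` for fitted networks (older versions use `compute()`), and that `startweights` accepts the `weights` element of a previous fit. The sketch simply warm-starts each fit from the previous weights and then keeps only the new network, which is exactly the step I am unsure about.

```r
library(neuralnet)

set.seed(1)
gamma <- 0.9  # discount factor (assumed value)

# neuralnet() only returns a network by fitting one, so the "initial network"
# here comes from a throwaway fit on arbitrary targets.
init <- data.frame(s = runif(20), q0 = runif(20), q1 = runif(20))
nn <- neuralnet(q0 + q1 ~ s, data = init, hidden = 3, threshold = 0.1)
stopifnot(!is.null(nn$weights))  # the initial fit must have converged

for (iter in 1:5) {
  # 1. Generate experience (hypothetical environment: reward 1 if the
  #    action matches the side of 0.5 that s falls on, next state random).
  n      <- 50
  s      <- runif(n)
  a      <- sample(0:1, n, replace = TRUE)
  r      <- as.numeric(a == (s > 0.5))
  s_next <- runif(n)

  # 2. Build targets from the current network: keep its own predictions,
  #    but overwrite the entry of the action taken with r + gamma * max Q(s').
  q      <- predict(nn, data.frame(s = s))
  q_next <- predict(nn, data.frame(s = s_next))
  q[cbind(seq_len(n), a + 1)] <- r + gamma * apply(q_next, 1, max)
  batch <- data.frame(s = s, q0 = q[, 1], q1 = q[, 2])

  # 3. Fit states to targets, starting from the current weights.
  fit <- neuralnet(q0 + q1 ~ s, data = batch, hidden = 3,
                   startweights = nn$weights, threshold = 0.1)

  # 4. Keep the new network and drop the old one, unless the fit
  #    failed to converge (in which case no weights are returned).
  if (!is.null(fit$weights)) nn <- fit
}
```

After the loop, `nn` holds the weights of the most recent successful fit; none of the earlier weights are kept or averaged in.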
How should I combine the new weights with the initial weights? Do I simply keep the new weights and discard the initial ones?