My question is: why does the same algorithm give different results each time I train it, even though nothing else changes? Is this normal, or could there be a problem in the data or code?
The algorithm is Deep Deterministic Policy Gradient (DDPG).
It's absolutely normal, and there is no problem with either your data or your code.
The algorithm starts from a random state, such as the initial weights of the neural networks, so every run begins at a different point and converges differently. Try setting the numpy seed for reproducibility, as below:
import numpy as np

np.random.seed(42)  # fix numpy's global RNG so random draws repeat across runs
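Note that seeding numpy alone is usually not enough for DDPG: TensorFlow, Python's built-in random module, and the environment each keep their own random state. Here is a minimal sketch of seeding all of them (assuming TensorFlow 2.x and a Gym-style environment; the commented-out env lines and the CartPole-v1 name are placeholders, not your setup):

import random
import numpy as np
import tensorflow as tf

SEED = 42

random.seed(SEED)          # Python's built-in RNG
np.random.seed(SEED)       # numpy RNG (weight init, replay-buffer sampling)
tf.random.set_seed(SEED)   # TensorFlow ops (initializers, dropout, etc.)

# Gym-style environments keep their own RNG as well; newer versions
# take the seed through reset() (older ones use env.seed(SEED)):
# import gymnasium as gym
# env = gym.make("CartPole-v1")      # placeholder environment
# obs, info = env.reset(seed=SEED)

Even with every seed fixed, some GPU kernels are non-deterministic, so small run-to-run differences can remain.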
When your model's weights are initialized, the framework usually initializes them randomly, often via something like np.random.rand(), which is why each run yields different results.
If you want the same random weights on every run, call np.random.seed(10) before building the model. Other libraries have equivalent commands.
Edit: I saw you used TensorFlow; in that case:
tf.random.set_random_seed(10)
(That is the TensorFlow 1.x call; in TensorFlow 2.x it is tf.random.set_seed(10).)
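The seed only takes effect if it is set before the weights are created. A minimal sketch of the ordering (assuming TensorFlow 2.x / Keras; the small two-layer network is just an illustration, not the DDPG actor from the question):

import numpy as np
import tensorflow as tf

np.random.seed(10)
tf.random.set_seed(10)   # must run before any layers are built

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1),
])

# With the seeds fixed, the initial kernel weights are identical across runs:
print(model.layers[0].get_weights()[0][0, :3])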