I have been implementing reinforcement learning algorithms in Python, using several variants: Q-learning, Deep Q-Network (DQN), Double DQN, and Dueling Double DQN. Taking the cart-pole problem as an example, I can think of two ways to evaluate the performance of each variant: plotting the sum of rewards against the number of episodes (picture of the plot attached), and watching the rendered output to see how stable the pole stays while the cart moves.
However, neither of these evaluations really explains quantitatively which variant is better. I am new to reinforcement learning and would like to know whether there are other ways to compare different variants of RL models on the same problem.
For the code for all the cart-pole variants, I am referring to this Colab notebook: https://colab.research.google.com/github/ageron/handson-ml2/blob/master/18_reinforcement_learning.ipynb#scrollTo=MR0z7tfo3k9C
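
To make the reward-versus-episode comparison concrete, here is a minimal sketch of how the per-episode reward sums of each variant could be smoothed before plotting them on one chart. The `moving_average` helper, the `window` parameter, and the numbers in `episode_rewards` are assumptions made for illustration; they are placeholders, not results from the notebook.

```python
from statistics import mean


def moving_average(rewards, window=10):
    """Trailing moving average of per-episode reward sums.

    Early episodes are averaged over however many values are available.
    """
    smoothed = []
    for i in range(len(rewards)):
        start = max(0, i - window + 1)
        smoothed.append(mean(rewards[start:i + 1]))
    return smoothed


# Hypothetical per-episode reward sums (placeholder values, not real results).
episode_rewards = {
    "DQN": [10, 20, 30, 40],
    "Double DQN": [10, 30, 50, 70],
}

# One smoothed curve per variant, indexed by episode number.
smoothed = {
    name: moving_average(rewards, window=2)
    for name, rewards in episode_rewards.items()
}

for name, curve in smoothed.items():
    print(name, curve)
```

Each list in `smoothed` can then be passed to any plotting library with the episode index on the x-axis, which should make the curves of the different variants easier to compare than the raw, noisy reward sums.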