I am currently working on optimizing the reward values for a Q-learning setup. Right now I combine two values to calculate the reward. Since this is work-related, I can't share the actual variable names. The reward takes the form: reward = a + b
where a takes one of the values in the list [10, 20, 40, 60, 80] and b can be any value from 0 to infinity, i.e. b ∈ [0, ∞). Even though b will not actually be that large, it can take any value within that range.
So the situation is this: if b is something like b = 1300 and a = 80, then reward = 1380, and the contribution of a gets eclipsed by b. Is there some way I can formulate the reward so that a and b have equal priority, for example with each carrying 50% of the weight when the reward is calculated?
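
To make the problem concrete, here is a minimal Python sketch of the current additive reward. The names `A_VALUES`, `reward`, and `share_of_a` are placeholders I made up for illustration (the real variable names can't be shared), and the choice of Python is only an assumption.

```python
# Hypothetical names: the real identifiers are not shown here.
A_VALUES = [10, 20, 40, 60, 80]  # the only values a can take


def reward(a: float, b: float) -> float:
    """Current formulation: a plain sum of the two components."""
    if a not in A_VALUES:
        raise ValueError(f"a must be one of {A_VALUES}")
    if b < 0:
        raise ValueError("b must be in [0, inf)")
    return a + b


def share_of_a(a: float, b: float) -> float:
    """Fraction of the total reward that comes from a."""
    return a / reward(a, b)


if __name__ == "__main__":
    a, b = 80, 1300
    print(reward(a, b))                 # 1380
    print(f"{share_of_a(a, b):.1%}")    # 5.8%
```

With a = 80 and b = 1300, a accounts for only about 5.8% of the reward, even though 80 is the largest value a can take. This is the imbalance I would like to remove.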