In the context of Double Q or Dueling Q Networks, I am not sure I fully understand the difference, especially with V. What exactly is V(s)? How can a state have an inherent value?
If we consider this in the context of trading stocks, let's say, how would we define these three variables?