When studying reinforcement learning, and specifically model-free RL, there are two methods that are generally used:
- TD learning
- Monte Carlo
When should each one be preferred over the other? In other words, how do we decide which method best suits a given problem?
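To make the comparison concrete, here is a minimal sketch of the two update rules as I understand them, for tabular state-value estimation. The function names, the toy episode, and the values of `alpha` and `gamma` are assumptions chosen for illustration, not taken from any particular library or textbook listing.

```python
from collections import defaultdict


def td0_update(V, state, reward, next_state, done, alpha=0.1, gamma=0.99):
    """TD(0): update after a single step, bootstrapping from V[next_state]."""
    # The target uses the current estimate of the next state (zero if terminal).
    target = reward + (0.0 if done else gamma * V[next_state])
    V[state] += alpha * (target - V[state])


def mc_update(V, episode, alpha=0.1, gamma=0.99):
    """Every-visit Monte Carlo: update only once the episode has ended."""
    G = 0.0
    # Walk backwards so the full discounted return G is accumulated per state.
    for state, reward in reversed(episode):
        G = reward + gamma * G
        V[state] += alpha * (G - V[state])


# Hypothetical episode: (state, reward received on leaving that state).
episode = [("A", 0.0), ("B", 0.0), ("C", 1.0)]

# TD(0) can update online, one transition at a time.
V_td = defaultdict(float)
for i, (state, reward) in enumerate(episode):
    done = i == len(episode) - 1
    next_state = None if done else episode[i + 1][0]
    td0_update(V_td, state, reward, next_state, done)

# Monte Carlo has to wait for the complete episode.
V_mc = defaultdict(float)
mc_update(V_mc, episode)

print(dict(V_td))
print(dict(V_mc))
```

In this toy run, after one episode TD(0) has only changed the estimate for `C` (the state next to the reward), while Monte Carlo has moved the estimates for all three states, since it uses the full return rather than a bootstrapped one-step target.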