Why does the pendulum have both cos and sin features? Can I use just one of them? Or can I use theta (the angle) instead?

I'd appreciate some explanation for this XD; intuitive or theoretical ones are all welcome.

1 Answer

The angle (theta) is passed through the sin() and cos() functions so that the observations lie in the range [-1,1]. This fixed range helps stabilise the training of neural networks, which has been explained well here.
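
To make the range argument concrete, here is a minimal sketch. The helper name `encode_angle` is hypothetical and not taken from the Pendulum source; it only illustrates that the cos/sin features stay in [-1,1] however large the raw angle is.

```python
import math

def encode_angle(theta, theta_dot):
    """Hypothetical helper: build a (cos, sin, theta_dot) observation from a raw angle."""
    return (math.cos(theta), math.sin(theta), theta_dot)

# However large the raw angle gets, the first two features stay in [-1, 1].
for theta in (0.0, math.pi / 2, 3 * math.pi, -25.0):
    c, s, _ = encode_angle(theta, 0.0)
    assert -1.0 <= c <= 1.0 and -1.0 <= s <= 1.0
```

Note that this sketch leaves theta_dot untouched, so only the angle part of the observation is bounded by the encoding.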

You could even use just one of sin() or cos() as your observation. The reason (as far as I can think of) for using both sin() and cos() is probably to give more information about the state. Using both may also lead to faster convergence.

But normalisation of the inputs is necessary. So, you cannot just use the angles as your state observations for training.

Edit: Answer to the comment by @CHEN TIANRONG

[Plots]

I ran DDPG with just sin() and theta_dot as the observation in one experiment, and with sin(), cos() and theta_dot in another. The agent clearly never learns the task in the first experiment.
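
One possible explanation for this result, offered as an assumption rather than something the experiment verifies: sin() alone maps two different angles to the same value, so the agent cannot tell them apart, while the (cos, sin) pair identifies the angle uniquely. A small sketch with hypothetical helper names:

```python
import math

def sin_only(theta):
    """Observation that keeps only sin(theta)."""
    return math.sin(theta)

def cos_sin(theta):
    """Observation that keeps both cos(theta) and sin(theta)."""
    return (math.cos(theta), math.sin(theta))

def decode(c, s):
    """Recover the angle in (-pi, pi] from its cos/sin pair."""
    return math.atan2(s, c)

theta_a = math.radians(30)   # pendulum tilted to one side
theta_b = math.radians(150)  # mirror image about the 90 degree axis

# sin() cannot tell these two angles apart ...
assert math.isclose(sin_only(theta_a), sin_only(theta_b))

# ... but the cos component has opposite signs, so the pair is unambiguous.
assert cos_sin(theta_a)[0] > 0 > cos_sin(theta_b)[0]
assert math.isclose(decode(*cos_sin(theta_b)), theta_b)
```

The same argument applies to cos() alone, which cannot distinguish theta from -theta.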

I guess the choice of using both sin() and cos() was made empirically.

You can find the code I used for the experiments here.

Improving the rate of convergence of a neural network for RL agents is an active area of research. You could search for algorithms which are sample efficient. For example: Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, Sample Efficient Actor-Critic with Experience Replay, etc.

nsidn98
  • Awesome answer. You said that using both of them leads to relatively fast convergence? How did you arrive at that? I mean... how can I find ways to accelerate the convergence of a neural network, e.g. by adding more features, as you said? Are there any papers that explain this? – BayesFans Jul 05 '19 at 01:50
  • I answered your comment in the edit to the answer. – nsidn98 Jul 08 '19 at 12:01
  • What a good answer! Thanks! I will check these two papers later. Thanks!!! – BayesFans Jul 09 '19 at 19:55