Updating alpha and beta parameters for Beta distribution with more and more feedback

Question

I am working on ranking online content based on customer feedback for my college project. For that, I associate each piece of content with prior alpha and beta parameters and update them based on the feedback I get. As I simulate more and more trials, the values of alpha and beta keep increasing. I want my model to be more reactive to recent customer behavior, so in my updates I decay the previous parameters by a factor of 0.9 and add the alpha and beta counts from the last day (a first-order inhomogeneous linear difference equation).

Due to the decay, the model forgets that some content was suboptimal and tries to explore it again, leading to some cyclic behavior. Is there a better way to solve this? I tried looking at only the last month of data to build my distribution, but that seems to be "forgetful" too. How do I prevent alpha/beta from getting too large, while ensuring the model is reactive and doesn't forget suboptimal strategies?

score 0 · Answer 1 · answered Feb 13 '20 at 10:30

Whatever changes you make to your model, there will always be a trade-off between how reactive it is and how much memory it retains. A model cannot retain everything and still keep up with changing customer behaviour. For example, a model that retains everything would find no reason to try other arms even if customer behaviour has changed. On the other hand, to stay reactive, the model needs to keep trying sub-optimal arms to check whether one of them has become optimal, even though this makes it incur some extra regret. Note that in a non-stationary setting, it won't be possible to perform as well as in a stationary setting.

You have tried both of the standard ways of giving more weight to newer data: discounting (with a factor of 0.9) and considering data only from the last n days. If you find that these parameter values give you models that are too forgetful, you can try increasing the discount factor (moving it closer to 1) or the number n of days that you consider.
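For concreteness, here is a minimal Python sketch of the two approaches above. The names `discounted_update` and `SlidingWindowBeta`, the uniform Beta(1, 1) prior, and the daily (successes, failures) counts are all assumptions made for illustration, not taken from your code.

```python
from collections import deque


def discounted_update(alpha, beta, successes, failures, gamma=0.9):
    """One daily update: shrink the old pseudo-counts by gamma,
    then add the new day's counts. gamma closer to 1 means longer memory."""
    return gamma * alpha + successes, gamma * beta + failures


class SlidingWindowBeta:
    """Beta parameters built only from the last n_days of feedback
    (hypothetical helper; the prior is an assumed Beta(1, 1))."""

    def __init__(self, n_days, prior_alpha=1.0, prior_beta=1.0):
        self.prior_alpha = prior_alpha
        self.prior_beta = prior_beta
        # deque with maxlen drops the oldest day automatically
        self.window = deque(maxlen=n_days)

    def add_day(self, successes, failures):
        self.window.append((successes, failures))

    def params(self):
        alpha = self.prior_alpha + sum(s for s, _ in self.window)
        beta = self.prior_beta + sum(f for _, f in self.window)
        return alpha, beta
```

Both versions keep alpha and beta bounded as long as the daily counts are bounded; `gamma` and `n_days` are the knobs that trade reactivity against memory.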

As you increase these parameters, your models will become less forgetful but also less reactive. You need to find values that work for you. Also, it might not be possible to achieve both the reactivity and the memory you are hoping for at the same time.
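As a rough guide for picking the discount factor, assume your update has the form below, where $\gamma$ is the discount factor and $s_t$, $f_t$ are the day's positive and negative feedback counts (these symbols are my notation, not yours):

$$\alpha_t = \gamma\,\alpha_{t-1} + s_t, \qquad \beta_t = \gamma\,\beta_{t-1} + f_t$$

If the total feedback per day is roughly constant, $s_t + f_t \approx m$, then the parameters stop growing and settle near

$$\alpha_t + \beta_t \;\to\; \frac{m}{1-\gamma}$$

So $\gamma = 0.9$ behaves like remembering about 10 days of feedback, while $\gamma = 0.99$ behaves like remembering about 100 days. This is only a steady-state approximation under the constant-traffic assumption, but it gives a way to compare a discount factor with a window of n days.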

Hope this helps!
