Is any version of a multi-armed bandit (epsilon-greedy, Thompson Sampling, UCB) any good when the reward/click rate is very low relative to the pull rate? I have 600 pieces of content receiving approximately 3,000 clicks per day (total across all content) out of roughly one million requests. Given this, would it be useful to implement a MAB? Is this click rate statistically meaningful enough for the algorithm to learn from?
- This question is off-topic on SO; try [ai.SE](https://ai.stackexchange.com/) or [stats.SE](https://stats.stackexchange.com). – cheersmate Dec 11 '18 at 08:03
- For click-rate prediction, you could look at factorization machines. – Venkatachalam Dec 11 '18 at 08:39
1 Answer
Do the 600 pieces of content change every day, or do they stay the same? If they stay the same, then an asymptotically optimal algorithm would start performing extremely well soon enough: at one million requests per day spread over 600 items, each item averages over 1,600 pulls daily, so per-item click-rate estimates sharpen quickly even at a low overall CTR.
Even if the pieces of content change, Thompson Sampling should still work and give you something significantly better than random selection. I have run various experiments with Thompson Sampling for my research, and it tends to start doing well very quickly on most of them.
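To make the suggestion concrete, here is a minimal sketch of Thompson Sampling for binary click/no-click rewards, using a Beta posterior per item. The class name and the Beta(1, 1) uniform prior are illustrative assumptions, not part of the question or answer above:

```python
import random

class ThompsonSampler:
    """Hypothetical sketch: Thompson Sampling with Beta posteriors
    over each item's click-through rate (binary rewards)."""

    def __init__(self, n_items):
        # Beta(1, 1) uniform prior for every item's click probability
        self.alpha = [1.0] * n_items  # 1 + clicks observed
        self.beta = [1.0] * n_items   # 1 + non-clicks observed

    def select(self):
        # Draw one plausible CTR per item from its posterior,
        # then serve the item with the highest draw
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, item, clicked):
        # Fold the observed outcome into the chosen item's posterior
        if clicked:
            self.alpha[item] += 1.0
        else:
            self.beta[item] += 1.0
```

With the traffic described in the question (roughly a million requests and 3,000 clicks per day), each item still accumulates a few clicks per day on average, so the posteriors separate the better items from the worse ones within days rather than weeks.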

Sanit