
I have a dataset with different email codes, email recipients and a flag of whether they responded to the email. I calculated the past response rates for each person, for the emails preceding the current email (sum of responses / number of emails). It looks something like this:

email_code  responded  person  number_of_emails  response_rate  date
wy2         1         A       0                 0              2022/01/12
na3         1         A       1                 100            2022/01/22
li3         0         A       2                 100            2022/01/23
pa4         1         A       3                 66             2022/01/24   

However, this doesn't seem right. Imagine that person A received 1 email and replied to it, so their response rate is 100%. Person B received 10 emails and replied to 9 of them, so their response rate is 90%. Yet person B is arguably the more reliable responder, because their rate is backed by far more emails.

I think I need to calculate some Bayesian average, in a similar vein to this post and this website. However, these websites show how to do this for ratings, and I do not know how I can adapt the formula to my case.

Any help/suggestions would be greatly appreciated!

fifigoblin
  • Interesting question, however, it's off topic here since it's a discussion question about the general approach. Try stats.stackexchange.com instead. – Robert Dodier Jan 24 '22 at 16:33

1 Answer


The post on SO perfectly describes how you can calculate the Bayesian rating, IMO.

I quote:

rating = (v / (v + m)) * R +
         (m / (v + m)) * C;

The variables are:

  • R – The item's own rating. R is the average of the item's votes. (For example, if an item has no votes, its R is 0. If someone gives it 5 stars, R becomes 5. If someone else gives it 1 star, R becomes 3, the average of [1, 5]. And so on.)
  • C – The average item's rating. Find the R of every single item in the database, including the current one, and take the average of them; that is C. (Suppose there are 4 items in the database, and their ratings are [2, 3, 5, 5]. C is 3.75, the average of those numbers.)
  • v – The number of votes for an item. (To give another example, if 5 people have cast votes on an item, v is 5.)
  • m – The tuneable parameter. The amount of "smoothing" applied to the rating is based on the number of votes (v) in relation to m. Adjust m until the results satisfy you. And don't misinterpret IMDb's description of m as "minimum votes required to be listed" – this system is perfectly capable of ranking items with less votes than m.

So in your case:

  • R is the response rate, i.e. number of replies / number of received emails. If someone hasn't received any emails, set R to 0 to avoid division by zero. If they haven't responded to any received emails, their R is of course zero.

  • C is the sum of the Rs of all recipients divided by the number of recipients.

  • v is the number of received emails. If someone received 10 emails, their v will be 10. If they haven't received any emails, their v will be zero.

  • m is, as described in the original post, the tuneable parameter.
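The mapping above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name, the `history` data, and the choice `m = 5` are made up for the example, and rates are kept as fractions between 0 and 1 rather than percentages.

```python
def smoothed_response_rate(replies, emails, C, m):
    """Bayesian-average response rate as a fraction between 0 and 1."""
    v = emails
    if v + m == 0:
        # No real emails and no smoothing: fall back to the overall average.
        return C
    # R is the person's raw rate; it is set to 0 when no emails were received.
    R = replies / v if v > 0 else 0.0
    return (v / (v + m)) * R + (m / (v + m)) * C


# Hypothetical data: person -> (number of replies, number of received emails)
history = {"A": (1, 1), "B": (9, 10), "C": (3, 10), "D": (0, 4)}

# C is the plain average of every person's raw response rate.
raw_rates = {p: (r / n if n > 0 else 0.0) for p, (r, n) in history.items()}
C = sum(raw_rates.values()) / len(raw_rates)

m = 5  # tuneable smoothing parameter, chosen arbitrarily here

smoothed = {p: smoothed_response_rate(r, n, C, m) for p, (r, n) in history.items()}
for person, rate in sorted(smoothed.items(), key=lambda kv: kv[1], reverse=True):
    print(person, round(rate, 3))
```

With this made-up data, person B (9 of 10) ends up ranked above person A (1 of 1), which is the behaviour the question asks for. Note that the result depends on C: if the overall average were higher than B's raw rate, A could still come out ahead.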

Further quote from the original post which describes m very well:

All the formula does is: add m imaginary votes, each with a value of C, before calculating the average. In the beginning, when there isn't enough data (i.e. the number of votes is dramatically less than m), this causes the blanks to be filled in with average data. However, as votes accumulate, eventually the imaginary votes will be drowned out by real ones.
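For response rates, the "imaginary votes" reading means adding m imaginary emails, each counting as C replies, so that rating = (replies + m * C) / (v + m). A small sketch of that equivalence follows; the function names and the values of C and m are assumptions picked for illustration.

```python
def weighted_form(replies, emails, C, m):
    """The quoted formula: a weighted mix of the raw rate R and the average C."""
    R = replies / emails if emails > 0 else 0.0
    return (emails / (emails + m)) * R + (m / (emails + m)) * C


def imaginary_emails_form(replies, emails, C, m):
    """Same value, computed by adding m imaginary emails worth C replies each."""
    return (replies + m * C) / (emails + m)


C, m = 0.5, 5  # assumed overall average and smoothing parameter

# Someone who always answers 90% of their emails, with growing history.
# The smoothed rate starts near C and moves toward 0.9 as emails accumulate.
for emails in (10, 100, 1000):
    replies = 0.9 * emails
    a = weighted_form(replies, emails, C, m)
    b = imaginary_emails_form(replies, emails, C, m)
    print(emails, round(a, 4), round(b, 4))
```

Both functions require emails + m to be greater than zero. The second form is often handier in practice because it works directly on the counts and does not need a special case for someone who has received no emails.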

user1984
  • Thank you very much! I just have one question, about `m`. In my dataset, the minimum number of received emails is 0, and maximum is 15. So should `m` be a number between 0 and 15? – fifigoblin Jan 24 '22 at 16:24
  • I think `m` needs to be between 0 and 15 to be meaningful. But as the post explains, `m` doesn't disqualify items that have fewer votes than `m`. It just makes up for their missing votes with the overall average of the whole population. This means that if a person has received zero emails, their rating will be roughly the average of the whole. As they receive more emails, the impact of `m` fades. I think playing a little with `m`, looking at the data, and tuning it over time would be a good first approach. – user1984 Jan 24 '22 at 16:39