I have a dataset with different email codes, email recipients and a flag of whether they responded to the email. I calculated the past response rates for each person, for the emails preceding the current email (sum of responses / number of emails). It looks something like this:
email_code responded person number_of_emails response_rate date
wy2 1 A 0 0 2022/01/12
na3 1 A 1 100 2022/01/22
li3 0 A 2 100 2022/01/23
pa4 1 A 3 66 2022/01/24
However, this doesn't seem right. Imagine that person A received 1 email and replied to it, so their response rate will be 100%. Person B received 10 emails and replied to 9 of them, so their response rate will be 90%. But person B is more likely to respond.
I think I need to calculate some Bayesian average, in a similar vein to this post and this website. However, these websites show how to do this for ratings, and I do not know how I can adapt the formula to my case.
Any help/suggestions would be greatly appreciated!