0

Given:

  • A discrete probability distribution over n-bit integers represented as a list ys of floats (i.e. sum(ys) == 1)
  • a parameter eps such that 0 < eps < 0.5

I want to generate a new probability distribution over n-bit integers, xs, such that the following holds:

(1) (forall) i in range(0, 2**n): 
    (1 - eps) <= xs[i] / ys[i] <= (1 + eps)

(2) sum(xs) == 1

My (naive) starting point is this:

for i in range(2**n):
    xs[i] = uniform(ys[i] * (1 - eps), ys[i] * (1 + eps))

but obviously sum(xs) may not be 1, and normalizing xs may break condition (1).


It seems to me that there should be a way to generate such a list where both constraints are met. Am I missing something fundamental here?

Alexandru Dinu
  • 1,159
  • 13
  • 24
  • You can get more accurate results if you use fractions rather than floating-point numbers (use the `fractions` module; e.g., `fractions.Fraction(5,100)`). Then, there wouldn't be a need for the `eps` parameter if all it serves is to bound the floating-point error. – Peter O. Jan 21 '21 at 21:06
  • See also my answer at: https://stackoverflow.com/a/61525097/815724 – Peter O. Jan 21 '21 at 21:09
  • The goal here is to have two distributions which are "multiplicatively" close, meaning that for all items in the support, the ratio is within `1-eps` and `1+eps`. I am not concerned with floating-point error. – Alexandru Dinu Jan 21 '21 at 21:09
  • 1
    for each xs[i] limit the error to eps/2**n. After they are generated, normalization will work. I don't think you can have uniform on entire ys[i] * (1 - eps), ys[i] * (1 + eps) – Bing Wang Jan 21 '21 at 21:10
  • 1
    a looser bound is to limit the error to eps/n. This by law of large numbers, the errors will cancel enough other and 'mostly' it should work. If it doesn;t regenerate. – Bing Wang Jan 21 '21 at 21:14

0 Answers0