2

I have been tossing around a conceptual idea for a machine (as in a Turing machine) and I'm wondering if any work has been done on this or related topics.

The idea is a machine that takes an entropy stream and gives out random symbols in any range without losing any entropy.

I'll grand that is a far from rigorous description so I'll give an example: Say I have a generator of random symbols in the range of 1 to n and I want to be able to ask for a symbols in any given range, first 1 to 12 and then 1 to 1234. (To keep it practicable I'll only consider deterministic machines where, given the same input stream and requests, it will always give the same output.) One necessary constraint is that the output contain at least as much entropy as the input. However, the constraint I'm most interested in is that the machine only reads in as much entropy as it spits out.

E.g. if asked for tokens in the range of 1 to S1, S2, S3, ... Sm it would only consume ceiling(sum(i = 1 to m, log(Si))/log(n)) inputs tokens.

This question asks about how to do this conversion while satisfying the first constraint but does very badly on the second.

Community
  • 1
  • 1
BCS
  • 75,627
  • 68
  • 187
  • 294

2 Answers2

1

Okay, I'm still not sure that I'm following what you want. It sounds like you want a function

f: I → O

where the inputs are a strongly random (uniform distribution etc) sequence of symbols on an alphabet I={1..n}. (So a series of random natural numbers ≤ n.) The outputs are another sequence on O={1..m} and you want that sequence to have as much entropy as the inputs.

Okay, if I've got this right, first off, if m < n, you can't. If m < n then lg m < lg n, so the entropy of the set of output symbols is smaller.

If mn, then you can do it trivially by just selecting the ith element of {1..m}. Entropy will be the same, since the number of possible output symbols is the same. They aren't going to be "random" in the sense of being uniformly distributed over the whole set {1..m}, though, because necessarily (pigeonhole principle) some symbols won't be selected at all.

If, on the other hand, you'd be satisfied with having a random sequence on {1..m}, then you can do it by selecting an appropriate pseudorandom number generator using your input from the random source as a seed.

Charlie Martin
  • 110,348
  • 25
  • 193
  • 263
  • Almost, if m>n you input fewer symbols than you output (count in need not equal count out) same the other way (more inputs than outputs) also the output range is non-constant (element 1 is on {1..12}, element 2 is on {1..1243}, etc.) – BCS Apr 13 '09 at 03:20
  • Yes, I do require uniform distribution on the output (assuming it is to be had on the input). – BCS Apr 13 '09 at 03:33
1

My current pass at it:

By adding the following restriction: you know in advance what the sequence of ranges is {S1, S2, S3, ..., Sn}, than using base translation with a non-constant base might work:

  1. Find Sp = S1 * S2 * S3 * ... * Sn
  2. Extract m=cealing(log(Sp)/log(n)) terms from the input {R1, R2, R3, ..., Rm}
  3. Find X = R1 + R2*n + R3*n^2 + ... + Rm*n^(m-1)
  4. Reform X as O1 + S1*O2 + S1*S2*O3 + ... Sn*On + x where 1 <= Oi <= Si

This might be reformable into a solution that works for one value at a time by pushing x back into the input stream. However I can't convince my self that even the known outputs range form is sound so...

BCS
  • 75,627
  • 68
  • 187
  • 294
  • Can you please define O1 and On here clearly, and how they come as a big O notation. How do you find the upper bound Si. – Léo Léopold Hertz 준영 Aug 09 '16 at 12:56
  • I think your current deterministic approach will always fail. I think you need some statistical adjustments to avoid the worst case scenario. - - I would love to understand better why you think the deterministic approach can be sufficient here. – Léo Léopold Hertz 준영 Aug 09 '16 at 13:02