The maximum value of this function basically depends on the precision limit; that is, how arbitrarily close to the limits (u -> 0)
or (1 - p -> 1)
the fixed point values can be.
If we assume (k)
fractional bits, e.g., with the limits: u = (2^-k)
and 1 - p = 1 - (2^-k)
,
then the maximum value is: k / (k - log2(2^k - 1))
(As the ratio of natural logarithms, we are free to use any base e.g., lb(x)
or log2
)
Unlike njuffa's answer, I went with a lookup table approach, settling on k = 10
fractional bits to represent 0 < frac(u) < 1024
and 0 < frac(p) < 1024
. This requires a log table with 2^k
entries. Using 32-bit table values, we're only looking at a 4KiB
table.
Any more than that, and you are using enough memory that you could seriously consider using the relevant parts of a 'soft-float' library. e.g., k = 16
would yield a 256KiB
LUT.
We're computing the values - log2(i / 1024.0)
for 0 < i < 1024
. Since these values are in the open interval (0, k)
, we only need 4 binary digits to store the integral part. So we store the precomputed LUT in 32-bit [4.28]
fixed-point format:
uint32_t lut[1024]; /* never use lut[0] */
for (uint32_t i = 1; i < 1024; i++)
lut[i] = (uint32_t) (- (log2(i / 1024.0) * (268435456.0));
Given: u, p
represented by [0.10]
fixed-point values in [1, 1023]
:
uint32_t func (uint16_t u, uint16_t p)
{
/* assert: 0 < u, p < 1024 */
return lut[u] / lut[1024 - p];
}
We can easily test all valid (u, p)
pairs against the 'naive' floating-point evaluation:
floor(log(u / 1024.0) / log(1.0 - p / 1024.0))
and only get a mismatch (+1 too high) on the following cases:
u = 193, p = 1 : 1708 vs 1707 (1.7079978488147417e+03)
u = 250, p = 384 : 3 vs 2 (2.9999999999999996e+00)
u = 413, p = 4 : 232 vs 231 (2.3199989016957960e+02)
u = 603, p = 1 : 542 vs 541 (5.4199909906444600e+02)
u = 680, p = 1 : 419 vs 418 (4.1899938077226307e+02)
Finally, it turns out that using the natural logarithm in a [3.29]
fixed-point format gives us even higher precision, where:
lut[i] = (uint32_t) (- (log(i / 1024.0) * (536870912.0));
only yields a single 'mismatch', though 'bignum' precision suggests it's correct:
u = 250, p = 384 : 3 vs 2 (2.9999999999999996e+00)