Preventing overflows and preserving precision when scaling (long-)integers

Question

Let's say I have a position pos within a given range, such that:

0 <= pos < range

This position within the range can consist in two different contexts, one where the range is an integer value, i.e. pos < range < 2³¹, and the other where the range is a long-integer, i.e. up to pos < range < 2⁶³. If I want to move between these contexts, I need to scale the position to the new range such that it is correctly rounded down to the nearest (long-)integer value. So, technically, all I want to do is:

pos_new = floor( pos_old * range_new / range_old )

Unfortunately, this straight forward approach doesn't do the trick since it either overflows (as pos_old * range_new can be as large as ~2⁹⁴) if I do the multiplication first or gives me rounding errors if I do the division first. Using floating point values to do the math doesn't help in general either, as they don't offer enough bits of precision and might therefore also lead to incorrect rounding (I only have double-precision available).

I found a way to scale from the integer range to the long-integer range correctly:

public long scaleUp(int oldPos, int oldRange, long newRange) {
    return (newRange / oldRange) * oldPos + 
           (newRange % oldRange) * oldPos / oldRange;
}

This ensures that the calculation neither overflows the limits of a long-integer at any point, nor looses accuracy due to premature rounding (the modulus captures the part lost to rounding in the first division).

What I'm trying to figure out now, is a way to do the reverse scaling:

public int scaleDown(long oldPos, long oldRange, int newRange) {
    return ??? ;
}

Not sure if this should be any more difficult than the other function, but somehow I'm not seeing it.

A couple of remarks:

I'd like to avoid using floating point arithmetic, since I always find it very hard to convince myself that it is truly not possible for a given formula to yield an unexpected result in some highly infrequent cases due to rounding
I'd rather not use a BigInteger library
While the code-samples here are Java, this really is a language-agnostic question

It's more language-specific than you think: IIRC java makes it somewhat more difficult/inefficient to use basic tools like counting the leading zeroes of a number, or obtaining the 128-bit product of two 64-bit integers. Or even being able to do unsigned arithmetic! (not to mention some languages have arbitrary-precision integers built in; it's hard to imagine a python implementation could possibly be any better than `p * r // R`) — , Jul 04 '13 at 08:42

score 0 · Answer 1 · edited Apr 13 '17 at 12:19

I figured out an answer that is not 100% complete, but covers all special cases that occur in my program. You can find the details of the derivation in the answer I posted to my corresponding question on the Mathematics StackExchange: https://math.stackexchange.com/q/433729/84557

Here's the rough outline:

public int scaleDown(long oldPos, long oldRange, int newRange) {
    if (oldPos <= Long.MAX_VALUE/newRange)
        return (int) (oldPos*newRange/oldRange);
    assert oldRange >= newRange*newRange : "Case not supported yet"; // Never happens in my code
    int newPos = (int) (oldPos / (oldRange/newRange));
    if (!isOk(newPos)) newPos--; // Check might be implementation specific
    return newPos;
}

Not complete, but maybe it's useful for someone anyways.

score 0 · Answer 2 · edited May 23 '17 at 12:34

0

I am very, very late to this party, but I ran into the exact same problem.

Note that I have been using the terms x for your pos, y for your oldRange, and w for your newRange, i.e. solving for z in the equation x/y = z/w.

Here are the solutions I've considered:

I could use floating points, but then whether the accuracy is acceptable for 64-bit integers becomes entirely dependent on whether the platform supports floats with a mantissa of at least 64 bits. (Many platforms do not.)

I could redefine the the input domain to one that's smaller than the output range by no more than half and then scale up - i.e. q = y / w; scaleUp(x / q / 2, y / q / 2, w);, and in the worst case the lowest-order bit would be entirely lost and in the best case no information would be lost at all, but I don't want to lose the lower-order bit in any case.

I could use a sort of binary search to find the value to subtract from x / (y / w) in order to compensate for error, but I don't consider the performance impact of using a loop or recursion to be acceptable.

Right now I the best I've come up with is to normally compute x / (y / w) - (y % w) * x / y and when the multiplication would overflow - which, granted, should only be for a relatively small domain of inputs (certainly fewer inputs than just computing x * w / y) - use an algorithm such as this one to capture the high bits of the product.

But I still really feel like there has got to be a simpler way to do this.

edited May 23 '17 at 12:34

Community

1
1

answered Feb 20 '17 at 00:02

btd

684
6
15

Looks like your formula is kind-of like the first step in the recursive solution that I posted here: http://math.stackexchange.com/questions/433729/scaling-numbers-cleverly-to-prevent-arithmetic-overflows-or-rounding-errors (the last formula under UPDATE) – Markus A. Feb 20 '17 at 17:19
I approached it the same way I did the case where `y <= w`, which was to determine an approximate answer with simple division and with an added (in the case of `y <= w`) or subtracted (in the case of `y > w`) error compensation term. I came up with `int q = y / w` then `x / q - (y / q - w) * x / y`. Essentially, this reduces the range `x in [0, y)` to `x in [0, n)` where `w <= n < 2w` without accuracy loss. It doesn't eliminate the opportunity for overflow, but greatly reduces the domain of inputs for which it occurs. The operation is equivalent to `x / (y / w) - (y % w) * x / y`. – btd Feb 20 '17 at 17:24
Oh, no, I'm completely dense and it can't be rewritten as modulo. That's my bad. `x / q - (y / q - w) * x / y` full stop. (Though I've also been investigating the option of writing `x / q - (y / q - w) * (x / q) / (y / q)` to further reduce the domain of inputs causing overflow - I think but am not 100% sure that this wouldn't reduce the accuracy of the result.) – btd Feb 20 '17 at 17:26
Is your case also such that you can be sure that `y >= w^2`? If so, you should be able to make the solution that I posted above work. If not, for the cases where `y < w^2`, you can fall back to an arbitrary precision math library (or a simplified version thereof that allows you to do math up to 128-bit ints)... – Markus A. Feb 20 '17 at 17:28
But if you find a more elegant answer, please do let me know! :) I've been racking my brain about this problem for the better part of a month back in the days and would love a more complete solution. – Markus A. Feb 20 '17 at 17:30
(I found that thinking about the problem with the implicit rounding due to divisions shown explicitly as I did here: http://math.stackexchange.com/a/435903/84557 to be quite helpful for this...) – Markus A. Feb 20 '17 at 17:31

Falk Hüffner · Answer 3 · 2020-04-19T08:07:10.367

0

EDIT: This solution is not correct, see discussion in the comments.

I have a solution without special cases. I don't have a proof of correctness, but it looks plausible and I couldn't find a counterexample.

int scaleDown2(long longPos, long longRange, int shortRange) {
    int p = longPos / (longRange / shortRange);
    return p - ((p * (longRange % shortRange)) / longRange);
}

edited Apr 19 '20 at 08:07

answered Apr 18 '20 at 08:41

Falk Hüffner

4,942
19
25

Thank you for the reply! I'll try and see if I can prove that this works. – Markus A. Apr 18 '20 at 15:25
Looks like this formula is actually not correct. A very simple counter-example is that `scaleDown2(1, 3, 2)` returns 1, but the actual answer should be 0, since 1*2/3 = 0.6666..., which should round down to 0. – Markus A. Apr 18 '20 at 16:07
It seems I misunderstood what function you're looking for then. I was trying to literally reverse scaleUp. For scaleUp(X, 2, 3) to yield 1 we need X=1, so I assumed that's what we're looking for. – Falk Hüffner Apr 18 '20 at 17:42
That makes sense. I guess I should have specified that `scaleDown` isn't necessarily the inverse of `scaleUp`. I do -- or I guess "did" in 2013 ;) -- need the rounded down result as per the formula for posₙₑᵥᵥ. I was using these mappings to implement a range coding algorithm. – Markus A. Apr 18 '20 at 22:24

Preventing overflows and preserving precision when scaling (long-)integers

3 Answers3