4

It is easy to find an integer with binary search even if it can be arbitrarily large: first guess the order of magnitude, then keep dividing the interval. This answer describes how to find an arbitrary rational number.

Having set the scene, my question is similar: how can we guess a IEEE 754 floating point number? Assume it is not NaN, but everything else is fair game. For each guess, your program will be told whether the number in question is higher, equal or lower. Minimize the number of guesses required in the worst case.

(This is not a homework assignment. Though, I might make it one, if this turns out to have an interesting answer that's not just "beat the floating point numerical difficulties to death with lots and lots of special case handling.")

Edit: if I were better at searching I could have found the answer---but that only works if you already know that reinterpretation as int works (with certain caveats). So leaving this up. Thanks to Harold for a great answer!

Matthias
  • 133
  • 2
  • 10
  • 1
    Are we allowed to reinterpret the bit-pattern of floats as ints? – harold Jul 09 '17 at 00:11
  • Sure, whatever you want. But the oracle will give you comparison result when interpreted as float. – Matthias Jul 09 '17 at 00:12
  • Why "first guess the order of magnitude" ? What dos that have to do with binary search? – H H Jul 09 '17 at 00:32
  • @HenkHolterman See the new link I added, I hope that makes the question clearer? Summary: to use the classic binary search you first need finite bounds. An integer in the mathematical sense doesn't come with any finite bounds a priori, it can be arbitrarily large or small. So you need to guess them, before you can start dividing the interval. – Matthias Jul 09 '17 at 07:48
  • @Matthias: so this is more a binary guess than a binary search. Binary searches search through a (sorted) list (or array or other container) of items. There, the interval is given by the largest and smallest values. – Rudy Velthuis Jul 10 '17 at 15:44
  • @Rudy, I grew up in academia were we routinely use the term binary search with this broader meaning. See eg https://stackoverflow.com/a/27354580/590534 for someone else doing the same. But, feel free to suggest an edit to the question. The term 'binary guess' doesn't seem to be common in the literature however. – Matthias Jul 10 '17 at 16:23
  • @Matthias: [See here](https://en.wikipedia.org/wiki/Binary_search_algorithm): "In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array." – Rudy Velthuis Jul 10 '17 at 16:32
  • @Rudy, I saw that one too. And for an intro article Wikipedia is good enough for the layman here, but you can see whether you can/want to improve the article to be more in line with actual usage in CS. The [Talk section](https://en.wikipedia.org/wiki/Talk:Binary_search_algorithm#terrible_state_of_this_article) has lots more ideas for how to improve the article. See also how [bisection method](https://en.wikipedia.org/wiki/Bisection_method) mentions the wider meaning of binary search. – Matthias Jul 10 '17 at 17:39
  • @Matthias: I am a layman too (I'm a programming dentist). What you are asking about is generally called a "number guessing game" and the binary strategy is just the most efficient one. – Rudy Velthuis Jul 10 '17 at 17:44
  • @Rudy, I guess I was being a bit too polite to Wikipedia: I mean, the article has serious issues and should not be used as a reference on its own. In any case, please feel free to propose a change in formulation to the question. (I'm a mathematician by training myself, only got into computing because it's so easy.) – Matthias Jul 10 '17 at 17:49
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/148812/discussion-between-matthias-and-rudy-velthuis). – Matthias Jul 10 '17 at 17:49
  • 1
    @Matthias: No need. It isn't too important. I just meant I wouldn't call it a *search*, although the principle is the same: repeatedly chop in halves. – Rudy Velthuis Jul 10 '17 at 18:11

2 Answers2

6

IEEE-754 64-bit floating point numbers are really 64-bit representations. Furthermore, with the exception of NaN values, there is no difference between floating point comparison and integer comparison of positive values. (That is, two bit patterns with the sign bit unset will produce the same comparison result regardless of whether you compare them as int64_t or double, unless one of the bit patterns is a floating point NaN-.)

That means you can find a number in 64 guesses by guessing one bit at a time, even if the number is ±∞. Start by comparing the number with 0; if the target is "less", then produce the guesses in the same way as below, but negate them before guessing. (Since IEEE-754 floats are sign/magnitude, you can negate the number by setting the sign bit to 1. Or you could do the positive bit-pattern reinterpretation and then floating point negate the result.)

After that, guess one bit at a time, starting with the highest-order value bit. Set that bit to 1 if the number is greater than or equal to the guess; set that bit to 0 if the number is less; and continue with the next bit until there aren't any more. To construct the guess, reinterpret the bit pattern as a double.

There are two caveats:

  1. You cannot distinguish between ±0 with comparison tests. That means that if your opponent wants you to distinguish between them, they will have to supply you with a way to ask about equality with −0, and you'll have to use that mechanism after you've apparently established that the number is 0 (which will happen on the 64th guess). This would add one guess, for a total of 65.

  2. If you are assured that the target is not a NaN, then there is no other problem. If it might be a NaN, you need to be careful how you compare: things will work out fine if you always ask "is X less than this guess?", because a NaN comparison will always return false. That means that after 11 successive "no" answers (not counting the one to establish the sign), you will find yourself guessing ∞, with the assumption that if the number is not less than ∞, it must be equal. However, in this case alone you need to explicitly test for equality as well, because that will also be false if the target is a NaN. This doesn't add an additional guess to the count, because it will always happen long before 64 guesses have been used up.

rici
  • 234,347
  • 28
  • 237
  • 341
  • 1
    On your first caveat: given that we get a three-way result each time (higher / lower / equal), and that the first comparison is with 0.0, wouldn't establishing that the number is +/- 0 happen on the first guess, so that a result of either +0.0 or -0.0 could happen on the second guess? – Mark Dickinson Jul 09 '17 at 07:45
  • Awesome! I even see very easily how this algorithm is not only terminating, but has optimal worst case runtime. Could you add a link to the background for the preservation of comparison under reinterpretation as `int64_t`, please? – Matthias Jul 09 '17 at 07:55
  • I implemented your idea in Python. Works a charm. I can guess every float in 64 steps. (And have some examples that fail in 63 steps.) – Matthias Jul 10 '17 at 20:54
0

The same approach can be applied to a floating point number. Worse case run time is O(log n).

public class GuessComparer
{
    private float random;
    public GuessComparer() // generate a random float and keep it private
    {
        Random rnd = new Random();
        var buffer = new byte[4];
        rnd.NextBytes(buffer);
        random = BitConverter.ToSingle(buffer, 0);
    }
    public int CheckGuess(float quess) // answer whether number is high, lower or the same.
    {
        return random.CompareTo(quess);
    }
}
public class FloatFinder
{

    public static int Find(GuessComparer checker)
    {
        float guess = 0;
        int result = checker.CheckGuess(guess);
        int guesscount = 1;
        var high = float.MaxValue;
        var low = float.MinValue;
        while (result != 0)
        {
            if (result > 0) //random is higher than guess
                low = guess;
            else// random is lower than guess

                high = guess;

            guess = (high + low) / 2;
            guesscount++;
            result = checker.CheckGuess(guess);
        }
        Console.WriteLine("Found answer in {0}", guesscount);
        return guesscount;
    }

    public static void Find()
    {
        var checker = new GuessComparer();
        int guesses = Find(checker);
    }
}
Alexander Higgins
  • 6,765
  • 1
  • 23
  • 41
  • 1
    Thanks. I am surprised this can deal with all the nastiness around overflow and underflow etc? I don't think this handles infinities right? I doubt this is optimal, since it doesn't know anything about float density (ie it doesn't take into account that there are more floats between 0 and 1 than between 1,000,000 and 1,000,001. – Matthias Jul 09 '17 at 00:09
  • 1
    It doesn't need to. As long as the starting high is the max value for the data type (3.40282347E+38 for a single) and starting low is min value (-3.40282347E+38) taking the average between high and low on each loop will make the algorithm converge just as it does with integers. There is no possible way to run into Positive Infinity (1.0F / 0.0F), Negative Infinity (-1.0F / 0.0F) or NaN (0.0F / 0.0F). – Alexander Higgins Jul 09 '17 at 00:31
  • 1
    What if the number in question to guess _is_ infinity? Or what if the number in question is just below float.MaxValue---won't your calculation of (high + low)/2 overflow close to the end when 'low' is also very big? – Matthias Jul 09 '17 at 07:52
  • 2
    The maximum value of a single is positive infinity, not 3.40282347E+38. – Patricia Shanahan Jul 09 '17 at 18:46