3

This is really fast:

1 in range(100000000000000)

This is really slow:

1.5 in range(100000000000000)

Why does the full range have to be generated to know that 1.5 isn't in range(X) when step has to be an integer?

Aran-Fey
  • 39,665
  • 11
  • 104
  • 149
SuperShoot
  • 9,880
  • 2
  • 38
  • 55
  • 2
    1 in range will stop after the first element. How is the speed when you're checking an int that isn't in your range? – matt Sep 17 '18 at 10:27
  • That's fast too. I would have thought that it's immediately obvious a number isn't in a range if the modulo of the number divided by the step isn't 0. – SuperShoot Sep 17 '18 at 10:30
  • I guess optimizations are made if the element is an integer. 'hello' in range(100000000000) is slow too – Corentin Limier Sep 17 '18 at 10:38
  • @CorentinLimier I suppose it's as simple as that! `99999999.0 in range(100000000)` takes a long time but `99999999 in range(100000000)` doesn't. – SuperShoot Sep 17 '18 at 10:40
  • My guess: floating points in ranges are tricky so they exhaustively search? But it should bail early from the search after it encounters a value in the range greater than the needle (the value its looking for)? – Ahmed Fasih Sep 17 '18 at 10:47
  • @AhmedFasih so in theory it would find 2 almost immediately and bail out of the example above quickly.. but it takes a long time so must have to populate a list with the full range of values first? – SuperShoot Sep 17 '18 at 11:48
  • @SuperShoot yes I'm very confused! – Ahmed Fasih Sep 17 '18 at 13:11

1 Answers1

3

If we check the source code:

The contains function:

range_contains(rangeobject *r, PyObject *ob)
{
    if (PyLong_CheckExact(ob) || PyBool_Check(ob))
        return range_contains_long(r, ob);

    return (int)_PySequence_IterSearch((PyObject*)r, ob,
                                   PY_ITERSEARCH_CONTAINS);
}

It appears to check if it will use an integer or boolean method to check, and if not it uses the PY_ITERSEARCH_CONTAINS.

matt
  • 10,892
  • 3
  • 22
  • 34
  • Even without knowing any c it’s clear what’s going on there so +1. Any thoughts on why? Too much of an edge case? Too expensive? Just not a function intended for floats? – SuperShoot Sep 17 '18 at 11:40