8

I was scrolling through a lot of list vs. range related questions on StackOverflow when I came across in one of the answers that look up in a range object is a CONSTANT TIME operation! It doesn't make sense how this can be done in O(1), you're going to have to iterate through the range to see if a number exists. Does the range object work like some hashmap/dictionary?

def usingRange(x):
    return x in range(0, 100000000)
print(usingRange(9999999)) 


def noRange(x):
    return x in list(range(0, 100000000))
print(noRange(9999999))

%timeit usingRange(9999999)
%timeit noRange(9999999)

I got the outputs as:

True
True
443 ns ± 8.83 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
6.89 s ± 380 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The reason why I want to know why its constant time is because

  1. I was told not to focus so much on learning "tricks" in Python for Programming Interviews! Because everything you do, you will have TO explain the time complexity, so you either memorize that i in list1 is O(N) or you iterate using a for loop over the list:

    for j in range(len(list1)):
        if list1[j] == i:
            print(True)
    
    else:
        print(False)
    

and then come to the conclusion that it is, in fact, O(N) because to check if an element existed you go have to iterate over each element of the list till the end of the list (worst case). But some constant time operations seem less obvious! I can live with the fact that a loop in a hash table is O(1) because there is a big concept called Hashing (not gone over it yet), but not sure how it works for range objects.

  1. I was told that it is paramount to know the basics! And do your best to explain every line of what you're writing and why you're writing it in whiteboards. Someone I know was literally ASKED to implement a HASHMAP for an SWE position at a tech VERY FAMOUS company.

  2. I have a LOT of time (1 year) to prepare and be inquisitive about stuff.

ababuji
  • 1,683
  • 2
  • 14
  • 39
  • Because in second one, you also take time to convert range into a list? – Marcus.Aurelianus Jul 12 '18 at 06:58
  • 4
    *“It doesn't make sense how this can be done in O(1), you're going to have to iterate through the range to see if a number exists.”* If I asked you whether 50 was between 0 and 100, would you have to count to 100 to find out? – Ry- Jul 12 '18 at 06:58
  • @Ry- Unless it works like a hashmap! Now, does it work like a hashmap? – ababuji Jul 12 '18 at 06:59
  • @Abhishek: Would you have to make a mental hashmap to find out? No, you’d compare 0 < 50 and 50 < 100. That’s what `range` does. `list` implements `in` as iteration, `set`/`dict` implement with hashing, `range` uses math. Anyway, there’s a good duplicate link. – Ry- Jul 12 '18 at 07:00
  • Are you saying that if we asked you whether 50 was between 0 and 100, you'd make a hash table in your head, hash all the values from 0 to 100 to store them, then hash 50? – abarnert Jul 12 '18 at 07:00
  • `val in range(start, stop, step)` is identical-ish to `start <= val < stop and (val - start) % step == 0`, give or take a corner case. `<=`, `%` and `==` are all O(1). – Amadan Jul 12 '18 at 07:00
  • @abarnert I wouldn't. I was just wondering how a computer would. – ababuji Jul 12 '18 at 07:01
  • @Amadan Yes, `range` is a container. It's even a sequence. – abarnert Jul 12 '18 at 07:02
  • @abamert: Typo, as I said. and `range` is a container in Python definition of the word, but not in the common-sense definition (which seems to be tripping up OP). – Amadan Jul 12 '18 at 07:03
  • A computer could do it the exact same way you do. Python before 3.2 didn't, but since then, it does. – abarnert Jul 12 '18 at 07:03
  • @Amadan In what sense is it not a container? It contains numbers in that you can ask it whether it has a given number, or iterate over all the numbers it has, or even index or slice them. – abarnert Jul 12 '18 at 07:04
  • @abarnert In the sense that it doesn't allocate memory for elements, and thus does not actually "contain" anything. It is completely defined by its three parameters, unlike true containers like `dict`, `list`, `tuple`... Also, I just checked, Python docs doesn't call ranges containers, either. – Amadan Jul 12 '18 at 07:08
  • @abarnert Actually it is not quite as simple as less/greater comparisons. You still have to take `step` into account. But the idea is the same: `x is in range(start, end, step) if start <= x < end and (x-start) % step == 0`. Still `O(1)` – user2390182 Jul 12 '18 at 07:08
  • @Amadan The docs don't call ranges containers because they call them _sequences_, and all sequences are containers. See [Sequence Types — list, tuple, range](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range). `range` is also used as the paradigm example of a "lazy" or "virtual" sequence, which contains all of its elements without having to store them. – abarnert Jul 12 '18 at 07:13
  • @schwobaseggl Well, if you want to get technical (in which case you should see the linked dup, which gets into all the gory details), it's not really O(1), but O(log N) on the size of the integers involved, because arithmetic—and, in the worst case, even comparisons—has to iterate over the 30-bit digits. – abarnert Jul 12 '18 at 07:16
  • @abarnert Please show me the sentence "all sequences are containers" in docs. It is implied (by the "Python supports a concept of iteration over containers"), but never said. What is explicitly said though is in [Collections](https://docs.python.org/3/library/collections.html) page: "This module implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers, `dict`, `list`, `set`, and `tuple`." You can see a conspicuous item missing. It would be so much easier if they didn't mix up "container" and "iterable", but Python terminology is awful. – Amadan Jul 12 '18 at 07:20
  • @Amadan Look at [`collections.abc`](https://docs.python.org/3/library/collections.abc.html#collections-abstract-base-classes). A `Sequence` is a `Reversible` `Collection` plus indexing. A `Collection` is a `Sized` `Iterable` `Container`. Therefore, a `Sequence` is a `Container`. Or look at [Emulating Container Types](https://docs.python.org/3/reference/datamodel.html#emulating-container-types) in the Data Model, which covers sequence types as a kind of container types. – abarnert Jul 12 '18 at 07:32
  • @Amadan As for the `collections` module: it doesn't provide any alternatives to `range`, so it doesn't say it provides alternatives to `range`. It also doesn't provide alternatives to `bytearray` or `str`; that doesn't mean they aren't containers, does it? – abarnert Jul 12 '18 at 07:36
  • @abarnert: Okay, sure. Thanks for letting me know it's an actual class name, not just a loose term. I still maintain that it is the disconnect between the ordinary meaning of the word "container" (there is something inside it) and the word "container" as used in Python (apparently, implements the `__contains__` contract), and it is this disconnect that caused this question to be asked. – Amadan Jul 12 '18 at 07:36
  • @Amadan I don't think most ordinary people would think of "contains" that way. Does a Python 3.4+ string not contain Unicode characters if it's actually stored in ASCII or UCS-2 and only constructs the 32-bit codepoints on demand? If I `|` the bits 2, 8, and 64 into a bitset object whose storage is actually an int32, not 32 separate bits, doesn't it still contain 64? Does a numpy array not contain floats because what it actually stores is a view over some object possibly made of strides from some other object full of native doubles and only creates float objects on demand as you index it? – abarnert Jul 12 '18 at 07:42
  • @Amadan The fact that `range` is a lazy container doesn't mean it's not a container, it just means it's lazy. – abarnert Jul 12 '18 at 07:42
  • Anyway, I think the main reason this question keeps coming up is the hundreds of SO answers, tutorials, blog posts, teachers, etc. that incorrectly say that `range` (or 2.x `xrange`) is an iterator or a generator. When you learn that it isn't, unless you learn it by reading the docs, you probably wouldn't guess that it's a sequence. At which point you probably wouldn't expect it to implement a fast `__contains__`. After all, the Python core devs didn't even think about that until they were cleaning up the `collections.abc` stuff, so I'm not surprised few novices do… – abarnert Jul 12 '18 at 07:46
  • 2
    By the way, as a general comment: What's really important here is getting a feel for what O(1), O(log N), and O(N) algorithms look like, and what data structures support them. You can get O(1) from a hash table, but also (even more simply) from a closed-form equation. Even without that, surely the values are sorted, so you could at least search in O(log N) time by bisecting, right? This kind of thinking is what most interviewers are looking for (only most; there are a few companies, including a really famous one, that seems to literally want to test how much of Knuth you've memorized…). – abarnert Jul 12 '18 at 07:51
  • 2
    In other words, don't _memorize_ that `i in list1` is O(N); instead, learn _why_ it's O(N), and how to apply that same logic to similar problems. – abarnert Jul 12 '18 at 07:56
  • Thank you @abernert. I will google (Stack search) each and every damn thing from now on. – ababuji Jul 12 '18 at 07:56
  • @abarnert It's really immaterial, but... For one thing, most of your examples are opaque to "most ordinary people". Opposite of Python, I don't collapse common-sense "contains" and "container" definitions, like "reports" (which anyone can do) vs "reporter". I can see how a string can contain a character yet I would not typically call string a container. I would say the bitset has 6th bit (or bit 64) set, not that it contains 64. And I do tend to think of numpy arrays as containers, which is a big source of pain and a reason I am not as proficient at numpy yet, as it keeps tripping me up. :P – Amadan Jul 12 '18 at 07:56
  • Regardless... it's pretty off-topic by now. Good advice to OP. Understand the structure, then you can deduce what it's good for and how to use it. – Amadan Jul 12 '18 at 07:57
  • Please don't move these comments to chat or anything or end up deleting them. I will read these comments and understand them more thoroughly when I begin to advance in Python. – ababuji Jul 12 '18 at 07:58

0 Answers0