6
from random import randrange
data = [(randrange(8), randrange(8)) for x in range(8)]

And we have to test if the first item equals to one of a tail. I am curious, how we would do it in most simple way without copying tail items to the new list? Please take into account this piece of code gets executed many times in, say, update() method, and therefore it has to be quick as possible.

Using an additional list (unnesessary memory wasting, i guess):

head = data[0]
result = head in data[1:]

Okay, here's another way (too lengthy):

i = 1
while i < len(data):
    result = head == data[i]
    if result:
        break
    i+=1

What is the most Pythonic way to solve this? Thanks.

varnie
  • 2,523
  • 3
  • 35
  • 42
  • 1
    What's wrong with simply saying `result = data[0] in data[1:]` ? If doing an `in` query, the list slice is simply an iterator, rather than a copy. – Aram Dulyan Oct 11 '10 at 04:17
  • 3
    oops. i didn't know that. any docs stated this? – varnie Oct 11 '10 at 04:19
  • 1
    @Aram Dulyan, What version of python? I can see my memory jump on my system moniter when I slice a large enough list (around ten million elements) in the manner you are describing. I too am going to need to see some docs to believe that. – aaronasterling Oct 11 '10 at 04:33
  • Sadly, taking a slice of a tuple causes a memory jump as well (Python 2.6). I would have thought that slices of immutable sequences would be implemented by wrapping the backing sequence. – ide Oct 11 '10 at 04:51
  • `lst = [1, 1]; it = iter(lst[1:]); lst[1] = 2; it.next()` gives `1`, again suggesting/confirming that the slice operator does create a copy of the original list for the purposes of iteration. – intuited Oct 11 '10 at 04:54
  • @intuited: What if slices were implemented using copy-on-write? – ide Oct 11 '10 at 05:04
  • @ide: then that would be super cool! I guess now somebody has to go dig up the source code. I'm guessing that they're not, because that would be nifty enough that it would be better known. – intuited Oct 11 '10 at 05:14
  • I just disassembled some code and python loads the slice indexes (1, None) with two `LOAD_CONST` s and then emits a `BUILD_SLICE` followed by a `BINARY_SUBSCR`. I don't know what the last instruction does so I'm going to have to look it up but it seems to emit identical opcodes. It's still possible that something else is happening within the opcodes but the evidence seems to be stacked pretty high against this proposition. – aaronasterling Oct 11 '10 at 05:25
  • @AaronMcSmooth I think you may be right. I've been dealing with Django QuerySets all day and might have gotten the internals of iterable objects mixed up. In any case, Nick D's answers below are completely spot on, and I'd go with one of those. – Aram Dulyan Oct 11 '10 at 05:38
  • FWIW, It looks like the relevant source code goes through `list_subscript`, which at line 2550 of the file `Objects/listobject.c` calls `list_slice`. `list_slice` allocates a new `PyListObject` and hits a for loop at line 481 of that same file. The for loop copies (a reference to) each element from the source list to the destination list, incrementing reference counts as it goes. Those line numbers are for Python 2.6.5: http://svn.python.org/view/python/tags/r265/Objects/listobject.c?view=markup – intuited Oct 11 '10 at 06:13
  • Also: Just to confirm that that code actually does get run —well, actually mostly just for the sake of digging into something with `gdb`— I ran `gdb --args /usr/bin/python -c 'lst = [1, 420]; lst[1:]' and set `break list_slice`. The second time it hit the breakpoint was to do the slicing in the command line code (the first looks to be something in `sys.path` setup). – intuited Oct 11 '10 at 07:04
  • Related: https://stackoverflow.com/questions/10532473/python-head-and-tail-in-one-line However, the answer there uses `head, *tail = data`, which will make a list copy. – Denilson Sá Maia Mar 01 '16 at 20:39

2 Answers2

8

Nick D's Answer is better

use islice. It doesn't make a copy of the list and essentially embeds your second (elegant but verbose) solution in a C module.

import itertools

head = data[0]
result = head in itertools.islice(data, 1, None)

for a demo:

>>> a = [1, 2, 3, 1]
>>> head = a[0]
>>> tail = itertools.islice(a, 1, None)
>>> head in tail
True

Note that you can only traverse it once but if all you want to do is check that the head is or is not in the tail and you're worried about memory, then I think that this is the best bet.

aaronasterling
  • 68,820
  • 20
  • 127
  • 125
  • Thank you! backing to previous comment, is it true that "data[0] in data[1:]" doesn't create new copy of list? – varnie Oct 11 '10 at 04:30
  • Cool, I didn't realize you could use `in` on iterators. It should be `itertools.islice(data, 1, None)` though — otherwise it will `StopIteration` after the 0th element. – intuited Oct 11 '10 at 04:30
  • @intuited, good looking out. I made the mistake when making the demo and edited the wrong line out. @varnie, anytime. I'm not sure. See my response to that comment. I'm dubious and want to see documentation. I've never heard of it but I'm a bit of a newb. – aaronasterling Oct 11 '10 at 04:36
  • @varnie. You should accept Nick D's answer so I can delete this one. He has better solutions. – aaronasterling Oct 11 '10 at 04:45
  • @AaronMcSmooth: Done. But in any case i found your solution being useful too. – varnie Oct 11 '10 at 05:00
  • @AaronMcSmooth: I think you should keep this answer, it's a useful alternative. I actually prefer it, because it's a one-liner (assuming you have already imported `itertools`) and doesn't waste effort by iterating past the desired value as does `count`. It's also good for people to know about because it's more generalized. – intuited Oct 11 '10 at 05:05
5

Alternative ways,

# 1
result = data.count(data[0]) > 1


# 2
it = iter(data)
result = it.next() in it
Nick Dandoulakis
  • 42,588
  • 16
  • 104
  • 136
  • A refinement of #2, based on the fact that the first "argument" to `in` must be evaluated before the test begins: `it = iter(data); it.next() in it` – senderle Jun 18 '12 at 03:36