Find tuple structure containing an unknown value inside a list

Question

Say I have list of tuples:

list = [(1,5), (1,7), (2,3)]

Is there a way in Python to write something like

if (1, *) in list: do things

where * means "I don’t care about this value"? So we are checking if there is a tuple with 1 at the first position and with whatever value on the second one.

As far as I know there are special mechanisms in other languages, but I just don’t know the name of this particular problem. So is there similar behavior in Python?

P.S.: I know that I can use list comprehensions here. I am just interested in this particular mechanism.

Side note: don't name your variable `list`. This will clash with the `list` function and data type from the Python standard library. — Eddo Hintoso, Mar 01 '16 at 10:04
Related: [How Python's any and all functions work?](http://stackoverflow.com/q/19389490) — Nick Volynkin, Mar 01 '16 at 11:56
Finally realized what this makes me think of: databases. You are effectively wanting something along the lines of `SELECT * IN collection WHERE column1 = 1` — Justin, Mar 02 '16 at 01:59
@AaronHall I'm here not for upvotes but for the answer and to be honest in my code I used list comprehension (without 'any' function). I selected answer because it's answers the question "Is there native placeholder mechanism in python?". — selotec, Mar 02 '16 at 03:45
@Justin: not really. They are testing any such rows *exist*, they are not selecting them. — Martijn Pieters, Mar 02 '16 at 10:36
@Justin: but you want to avoid retrieving all matching rows, when all you need to know the *existence* of a row. — Martijn Pieters, Mar 02 '16 at 13:51
@MartijnPieters Yes of course. I was just pointing out a similarity between this question and database operations. — Justin, Mar 02 '16 at 13:52
@Justin: sure, but then wrap that SELECT in a `WHERE EXISTS` clause. — Martijn Pieters, Mar 02 '16 at 13:53
The name of the language feature is QbE (query by example), unification (Prolog), or simply "pattern matching". — jhermann, Mar 04 '16 at 18:43

score 71 · Answer 1 · edited Mar 01 '16 at 10:09

71

You can use the any() function:

if any(t[0] == 1 for t in yourlist):

This efficiently tests and exits early if 1 is found in the first position of a tuple.

edited Mar 01 '16 at 10:09

Holloway

6,412
1
26
33

answered Mar 01 '16 at 09:53

Martijn Pieters

1,048,767
296
4,058
3,343

3

For what it's worth, I down-voted this answer, despite the fact that this is how I would solve OP's posted problem. However, considering the title of the thread, I don't think this answer (or any other current ones apart from my own) actually say what the OP wants to know (namely how to get a variable with "any" value). – acdr Mar 01 '16 at 10:44
13

@acdr: that's a different interpretation of the question. I'm sorry you don't find my answer helpful or enlightening, but this is a correct way to solve the underlying problem. – Martijn Pieters Mar 01 '16 at 10:46
1

It's a good solution, thanks, but a question was a slightly different. – selotec Mar 01 '16 at 10:50
@acdr: see, I don't think there was any need to down vote just because multiple interpretations are possible. Leave it to the OP to decide what interpretation they feel fits their question. – Martijn Pieters Mar 01 '16 at 10:50
1

The difference between this and OP is that this approach will iterate over `yourlist`. This is okay when `yourlist` is actually a `list` but when it is for example a 2d-tree, things will be more expensive. – WorldSEnder Mar 01 '16 at 14:14
3

@WorldSEnder: alternative data structures are entirely out of scope here; the OP did post a list in their question. – Martijn Pieters Mar 01 '16 at 14:15
1

@WorldSEnder I doubt a 2d-tree would be able to cope with an ANY key either, such as the one in the accepted answer.. – John Dvorak Mar 01 '16 at 15:56
actually, _both_ approaches iterate over `yourlist`; it's just that this one does it explicitly, and the other does it implicitly. because it's explicit, this one will iterate even when the source container does not normally iterate for `__contains__` (e.g. hash lookup or other shortcut). – Corley Brigman Mar 01 '16 at 16:56
The OP should edit the Q, since its title directly clashes with `if (1, *) in list: do things` given in the Q's body. This answer is a 100% fit for the *code*. – jhermann Mar 04 '16 at 18:33

acdr · Accepted Answer · 2016-03-01T10:55:48.170

33

A placeholder object like you're asking for isn't supported natively, but you can make something like that yourself:

class Any(object):
    def __eq__(self, other):
        return True
ANYTHING = Any()

lst = [(1,5), (1,7), (2,3)]

The __eq__ method defines how two objects test for equality. (See https://docs.python.org/3/reference/datamodel.html for details.) Here, ANYTHING will always test positive for equality with any object. (Unless that object also overrode __eq__ in a way to return False.)

The in operator merely calls __eq__ for each element in your list. I.e. a in b does something like:

for elem in b:
    if elem == a:
        return True

This means that, if you say (1, ANYTHING) in lst, Python will first compare (1, ANYTHING) to the first element in lst. (Tuples, in turn, define __eq__ to return True if all its elements' __eq__ return True. I.e. (x, y) == (a, b) is equivalent to x==a and y==b, or x.__eq__(a) and y.__eq__(b).)

Hence, (1, ANYTHING) in lst will return True, while (3, ANYTHING) in lst will return False.

Also, note that I renamed your list lst instead of list to prevent name clashes with the Python built-in list.

edited Mar 01 '16 at 10:55

answered Mar 01 '16 at 10:18

acdr

4,538
2
19
45

10

Pre-empting other comments: technically when doing trickery that involves redefining `__eq__`, one should also redefine `__ne__`, but that's not needed in this case. Also note that `ANYTHING==x` will return True regardless of `x`'s value, but the same is not necessarily true of `x==ANYTHING`. – acdr Mar 01 '16 at 10:23
1

There are other answers which talk about built in 'any' , Why do you think this isn't supported natively? – AlokThakur Mar 01 '16 at 10:27
2

@AlokThakur: There's two different functionalities here that sound very similar. It is indeed natively supported to check if any element of an iterable is True, but the OP didn't sound like that's what he wanted to know. The second issue, which is far more interesting, is having an object (`*` in OP's post, `ANYTHING` in mine) which is a placeholder for "any value", which is unrelated to the functionality provided by the `any` built-in. (That said, I edited my answer to clarify what isn't natively supported.) – acdr Mar 01 '16 at 10:31
That's the answer I was looking for. But could you clarify what actually happens in redefining __eq__? It's not clear for me how this goes from class Any to (1, ANYTHING) in lst example. Does it means that in class Any every comparison will return True? – selotec Mar 01 '16 at 10:47
@selotec I added a bit of explanation to the answer, but in short: yes, any instance of the `Any` class will (almost) always test `True` for equality tests. (Not quite always: imagine if I also made a `Nothing` class, and defined its `__eq__` to always return False!) – acdr Mar 01 '16 at 10:59
@acdr Based on which comes first will not `NOTHING == ANYTHING` be false, while `ANYTHING == NOTHING` is true? Hmmm... Interesting. – holroy Mar 01 '16 at 14:08
@holroy That's indeed exactly how it works! `NOTHING == ANYTHING` will call `NOTHING.__eq__(ANYTHING)`, whereas `ANYTHING == NOTHING` will call `ANYTHING.__eq__(NOTHING)` – acdr Mar 01 '16 at 14:33
Are you 100% sure that `in` will directly call `__eq__`? I believe there are situations where *before* calling `__eq__` the interpreter will simply check the memory addresses. This would make `x = (1, NOTHING); x in [x]` return `True` even if you expected it to return `False` due to how `NOTHING.__eq__` behaves... However I'm not sure if `in` is one of these. – Bakuriu Mar 01 '16 at 14:52
@Bakuriu Looks like you're right. From my NOTHING example above, `NOTHING in [NOTHING]` still returns true. I'd guess this is implemented under the hood as something like `(x is y) or (x.__eq__(y))` rather than just `x.__eq__(y)` but don't quote me on that. A quick google search doesn't teach me the exact workings of the `in` operator. – acdr Mar 01 '16 at 15:05
You can look at the sources on [`hg.python.org`](https://hg.python.org/). The `in` operator is compiled to a `COMPARE_OP` bytecode and you can then go look in the [`Python/ceval.c`](https://hg.python.org/cpython/file/tip/Python/ceval.c#l2707) file which contains the interpreter main loop, and you see that comparisons are done using [`cmp_outcome`](https://hg.python.org/cpython/file/tip/Python/ceval.c#l5088) the `in` case calls [`PySequence_Contains`](https://hg.python.org/cpython/file/tip/Objects/abstract.c#l1915) – Bakuriu Mar 01 '16 at 15:47
which calls [`_PySequence_IterSearch`](https://hg.python.org/cpython/file/tip/Objects/abstract.c#l1915) which simply calls `PyIter_Next` and uses [`PyObject_RichCompareBool`](https://hg.python.org/cpython/file/tip/Objects/object.c#l723) which finally uses memory address comparison. In fact there is a comment about forcing identity implying equality. – Bakuriu Mar 01 '16 at 15:49
10

Be VERY CAREFUL using solutions like this... the given solution will work for a list, but will NOT work for something like a dictionary: `d = { (1,1): 'A' }; (1, ANYTHING) in d` return `False`. This is because in this case, it's actually computing the hash of the source and looking up directly. Basically, anytime the object's lookup is O(1), it won't work without explicitly converting it to an O(n) lookup (e.g. `(1, ANYTHING) in d.keys()` does return `True`...). Just make sure you don't blindly use this... – Corley Brigman Mar 01 '16 at 16:53
@CorleyBrigman Actually using `.keys()` doesn't help at all if you are using non-outdated versions of python, since it returns a view (which is basically a `set`). Use `x in list(d.keys())` instead... – Bakuriu Mar 01 '16 at 18:04
good point... i'm still on python 2.7, where `keys()` is still a list. – Corley Brigman Mar 01 '16 at 18:13
6

If I saw code like this in a review I would kill it with fire and insist it be done as Martijn suggested. – Russia Must Remove Putin Mar 01 '16 at 21:57

Eddo Hintoso · Answer 3 · 2016-03-01T10:57:49.860

11

Not all of my solution methods provided below will be necessarily efficient. My goal is to demonstrate every possible solution method I can think of - at the end of my answer I provide "benchmark" results to show why or why not you should use one certain method over another. I believe that is a good way of learning, and I will shamelessly encourage such learning in my answers.

Subset + hash `set`s

>>> a_list = [(1,5), (1,7), (2,3)]
>>>
>>> set([l[0] for l in a_list])
{1, 2}
>>>
>>> 1 in set([l[0] for l in a_list])
True

`map()`, and anonymous functions

>>> a_list = [(1,5), (1,7), (2,3)]
>>>
>>> map(lambda x: x[0] == 1, a_list)
[True, True, False]
>>>
>>> True in set(map(lambda x: x[0] == 1, a_list))
True

`filter` and anonymous functions

>>> a_list = [(1,5), (1,7), (2,3)]
>>>
>>> filter(lambda x: x[0] == 1, a_list)
[(1,5), (1,7)]
>>>
>>> len(filter(lambda x: x[0] == 1, a_list)) > 0  # non-empty list
True

MICROBENCHMARKS

Conditions

1000 items
100K repetition
0-100 random range
Python 2.7.10, IPython 2.3.0

Script

from pprint import pprint
from random import randint
from timeit import timeit

N_ITEMS = 1000
N_SIM = 1 * (10 ** 5)  # 100K = 100000

a_list = [(randint(0, 100), randint(0, 100)) for _ in range(N_ITEMS)]

set_membership_list_comprehension_time = timeit(
    "1 in set([l[0] for l in a_list])",
    number = N_SIM,
    setup="from __main__ import a_list"
)

bool_membership_map_time = timeit(
    "True in set(map(lambda x: x[0] == 1, a_list))",
    number = N_SIM,
    setup="from __main__ import a_list"
)

nonzero_length_filter_time = timeit(
    "len(filter(lambda x: x[0] == 1, a_list)) > 0",
    number = N_SIM,
    setup="from __main__ import a_list"
)

any_list_comprehension_time = timeit(
    "any(t[0] == 1 for t in a_list)",
    number = N_SIM,
    setup="from __main__ import a_list"
)

results = {
    "any(t[0] == 1 for t in a_list)": any_list_comprehension_time,
    "len(filter(lambda x: x[0] == 1, a_list)) > 0": nonzero_length_filter_time,
    "True in set(map(lambda x: x[0] == 1, a_list))": bool_membership_map_time,
    "1 in set([l[0] for l in a_list])": set_membership_list_comprehension_time
}

pprint(
    sorted(results.items(), key = lambda x: x[1])
)

Results (in seconds)

[('any(t[0] == 1 for t in a_list)', 2.6685791015625),     # winner - Martijn
 ('1 in set([l[0] for l in a_list])', 4.85234808921814),
 ('len(filter(lambda x: x[0] == 1, a_list)) > 0', 7.11224889755249),
 ('True in set(map(lambda x: x[0] == 1, a_list))', 10.343087911605835)]

Who's got the last laugh now? ... Martijn (at least I tried)

MORAL OF THE STORY: Don't spend more than 10 minutes "proving" your inferior solution is faster and more efficient on a small test data, when another user's answer is the de-facto correct one

edited Mar 01 '16 at 10:57

answered Mar 01 '16 at 10:13

Eddo Hintoso

1,442
14
22

The `map()` defeats the purpose of `any()`; you now created a list with all boolean outcomes, up front. That's why you normally use a *generator expression* with `any()`. – Martijn Pieters Mar 01 '16 at 10:16
You created a set from a list; you can create the set directly, again with a generator expression rather than a list comprehension. – Martijn Pieters Mar 01 '16 at 10:17
Can you retest with larger datasets too? 3 elements is hardly a good test of the methods. – Martijn Pieters Mar 01 '16 at 10:37
With `a_list = [(random.randint(0, 10), random.randint(0, 10)) for _ in range(1000)]` (shared between tests, don't regenerate), and 100k repetitions, `any()` takes 0.18 seconds on Python 3.5. The other methods still take seconds (orders of magnitude slower). That's because your `set()` method still has to process **all** tuples, while `any()` only processes as many tuples as it takes to find one example. – Martijn Pieters Mar 01 '16 at 10:41
Wow I spent time on this more than I should have. But @MartijnPieters, I concede; I have updated my script because I did close to what your parameters are. I'm glad I did this though, it taught me not to be cocky and that we can all learn a thing or two from each other. Except maybe you. Maybe you're already a know it all. I don't know. Thanks anyways. – Eddo Hintoso Mar 01 '16 at 10:51
You could also add the now accepted approach (using a sentinel object that matches anything), and add a worst case and best case test (long list with no matches and long list with the first tuple matching). – Martijn Pieters Mar 01 '16 at 10:51
I'm sorry you find my feedback so.. condescending? I could just refrain from giving you honest feedback. There really is no need to be so rude when receiving it, at any rate. – Martijn Pieters Mar 01 '16 at 10:54
Wait I didn't find your feedback condescending... I'm actually appreciative of it. We're all here to learn though and I'm all for being a helpful member of the SO community; just out of curiosity how did you infer I was threatened...? I appreciate honest feedback, so that was genuine gratitude. – Eddo Hintoso Mar 01 '16 at 10:55
1

Perhaps ask someone else to read *[..] it taught me not to be cocky and that we can all learn a thing or two from each other. Except maybe you. Maybe you're already a know it all. I don't know. Thanks anyways* and ask them how that comes over? That comment does not come over as appreciative. :-) – Martijn Pieters Mar 01 '16 at 10:57
5

@MartijnPieters: HAHAHAHA I'm sorry - I was actually trying to make it sound like I respect your expertise and knowledge, like you already know everything and that you're the teacher more often than you are the student. But in hindsight man I DO sound like an incredible asshole (pardon my language)! But no, that was my bad, it was not my intention to voice it that way. I've seen you helping others elsewhere and I appreciate what you're doing for the community. Man, I feel sort of bad right now. Apologies my good man! – Eddo Hintoso Mar 01 '16 at 11:00
Hehe, no worries, apologies accepted. And for the record: Stack Overflow taught me so much of what I know, and continues to be a source of learning. Never stop being curious! – Martijn Pieters Mar 01 '16 at 11:06
Dude, I'll be expecting to see you around more - I'm starting to increasingly spend disproportionate amount of time on SO... But I honestly don't get how some of the top contributors even manage to be so active; it's like they're constantly refreshing the Questions page or something - which I think is incredibly selfless and amazing. Anyways, SO is telling me to avoid extended discussions in comments so I'll gladly heed its advice, hehe. I'll be sure to give back to the SO community I have only taken so much from - P.S. I still feel bad about sounding belligerent, haha! – Eddo Hintoso Mar 01 '16 at 11:12
4

Kudos for this reaction. We all need to [relearn this](http://blog.codinghorror.com/the-ten-commandments-of-egoless-programming/) from time to time. – holdenweb Mar 01 '16 at 11:36

score 5 · Answer 4 · edited Mar 01 '16 at 10:11

5

This can be done in Python using list comprehension. ex:

a= [(1, 2), (3, 4), (4, 5), (1, 4)]
[i for i in a if i[0] == 1]

Will give you:

[(1, 2), (1, 4)]

edited Mar 01 '16 at 10:11

Remi Guan

21,506
17
64
87

answered Mar 01 '16 at 10:09

nsr

855
7
10

Padraic Cunningham · Answer 5 · 2016-03-01T16:17:16.693

Indexing is the simplest but if you wanted to use syntax similar to your example where you wanted to assign the first value to a variable and ignore the rest you could use python3's extended iterable unpacking.

In [3]: [a for a,*_ in l]
Out[3]: [1, 1, 2]

Or with the any logic:

In [4]: l = [(1,5), (1,7), (2,3)]

In [5]: any(a == 1 for a,*_ in l)
Out[5]: True

Or mimicking any without the function call:

In [23]: l = [(1,5), (1,7), (2,3)]
In [24]: g = (a  for a,*_ in l)

In [25]: 1 in g
Out[25]: True

In [26]: list(g)
Out[26]: [1, 2]

score 2 · Answer 6 · answered Mar 01 '16 at 10:10

2

number of element in tuple could be handled also.

>>> import operator
>>> mylist = [(1,2), (1,5), (4,5,8)]
>>> any(i==1 for i  in map(operator.itemgetter(0), mylist))
True

answered Mar 01 '16 at 10:10

Nazmul Hasan

6,840
13
36
37

score 0 · Answer 7 · answered Mar 01 '16 at 15:58

It sounds like you actually want filter(), not any():

tuple_list = [(1,5), (1,7), (2,3)]
for pair in filter(lambda pair: (pair[0] == 1), tuple_list):
    print "Second value {pair[1]} found from {pair}".format(pair=pair)
...
Second value 5 found from (1, 5)
Second value 7 found from (1, 7)

The filter() method is great because you can provide a function directly to it. This lets you specify a certain key to filter on, etc. To simplify it further, use a lambda expression to make the entire thing into a one-liner.

The purpose of my answer is to show the Python 2 filter() method specifically as a correction to the any() method; not to necessarily show the cleanest way. Also, according to post OP is already aware of list comprehensions. — dhackner, Mar 01 '16 at 22:50

Find tuple structure containing an unknown value inside a list

7 Answers7

Subset + hash sets

map(), and anonymous functions

filter and anonymous functions