387

I know how to use both for loops and if statements on separate lines, such as:

>>> a = [2,3,4,5,6,7,8,9,0]
>>> xyz = [0,12,4,6,242,7,9]
>>> for x in xyz:
...     if x in a:
...         print(x)
...
0
4
6
7
9

And I know I can use a list comprehension to combine these when the statements are simple, such as:

print([x for x in xyz if x in a])

But what I can't find anywhere is a good example (to copy and learn from) demonstrating a complex set of commands (not just "print x") that runs after a combination of a for loop and some if statements. Something that I would expect to look like:

for x in xyz if x not in a:
    print(x...)

Is this just not the way Python is supposed to work?

Mateen Ulhaq
ChewyChunks
  • 36
    That's how it is... don't overcomplicate things by trying to simplify them. *Pythonic* does not mean to avoid every explicit `for` loop and `if` statement. – Felix Kling Aug 08 '11 at 11:57
  • 2
    You can use the list generated in your list comprehension in a for loop. That would somewhat look like your last example. – Jacob Aug 08 '11 at 12:01
  • So getting down to processing, what's the fastest way to combine a for loop with an if statement, if the if statement is excluding values that have already been matched and the list is continually growing during the for loop's iteration? – ChewyChunks Aug 08 '11 at 12:06
  • 3
    @Chewy, proper data structures will make the code faster, not syntactic sugar. For example, `x in a` is slow if `a` is a list. – Nick Dandoulakis Aug 08 '11 at 12:10
  • in my case, `a` is a dictionary. Is there something faster than both lists and dictionaries to search through? – ChewyChunks Aug 08 '11 at 12:16
  • @ChewyChunks: If `a` is a dictionary, can you give us a more concrete example with which to work? – johnsyweb Aug 08 '11 at 12:36
  • Sorry, my bad. `a` is a list of dictionaries, so Nick is right - processing slows down as the loop runs and `a` grows. (I have a time tracker report back every 1000 cycles or so) – ChewyChunks Aug 08 '11 at 15:24
  • 2
    This is Python, an interpreted language; why is anyone discussing how fast code is at all? – ArtOfWarfare Oct 08 '13 at 19:57
  • 1
    @ArtOfWarfare maybe because it is being used in places where it shouldn't. Where speed really matters. – Siavoshkc Jan 11 '18 at 15:03
  • 1
    Never mind the efficiency - the ugly doubling of indentation is reason enough for being able to express the loop in a single line. – beldaz Oct 09 '22 at 21:36
  • Pretty convinced if py _did_ allow @ChewyChunks' ideal `for x in xyz if x not in a:`, people who here write it's unnecessary and not pythonic, would use the feature to point out how great and smart py is. After all, this type of syntax is also ultra convenient in py's list and dict comprehension. – FlorianH Jan 01 '23 at 22:50

12 Answers

444

You can use generator expressions like this:

gen = (x for x in xyz if x not in a)

for x in gen:
    print(x)
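
The same pattern carries over to a body that does more than print. A minimal sketch, assuming a hypothetical multi-step body (the names `unmatched`, `total` and `report` are illustrative and not from the question):

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

# The generator expression does the filtering; the loop body stays flat.
unmatched = (x for x in xyz if x not in a)

report = []  # hypothetical: one message per unmatched item
total = 0    # hypothetical: running sum of unmatched items
for x in unmatched:
    total += x
    report.append("{0} is missing (running total {1})".format(x, total))

print(report)
```

The loop body is an ordinary suite of statements, so it can be as long as needed without adding a second level of indentation for the `if`.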
Ski3r3n
Kugel
  • Agreed. This looks exactly as I think it ought to look, and I understand how it will speed up and clean up my code. – ChewyChunks Aug 08 '11 at 12:29
  • Sorry another question: if I create a generator with enumerate, will it still work: – ChewyChunks Aug 08 '11 at 12:37
  • 1
    `gen = (y for (x,y) in enumerate(xyz) if x not in a)` returns >>> `12` when I type `for x in gen: print x` -- so why the unexpected behavior with enumerate? – ChewyChunks Aug 08 '11 at 12:40
  • @ChewyChunks: Please see the last sentence of my answer! `enumerate` is giving you `((0, 0), (1, 12), (2, 4), (3, 6), (4, 242), (5, 7), (6, 9))`, `1` is the only `x` not in `a` and `12` is its partner. This is not unexpected behaviour, you're just trying to do too much in one line. – johnsyweb Aug 08 '11 at 12:56
  • 17
    Possible, but not nicer than the original for and if blocks. – Mike Graham Aug 08 '11 at 14:34
  • So @Johnsyweb it will work given `gen = (y for (x,y) in enumerate(xyz) if y not in a)` ?? – ChewyChunks Aug 08 '11 at 15:32
  • 1
    @ChewyChunks. That would work but the call to enumerate is redundant. – johnsyweb Aug 08 '11 at 20:40
  • 193
    I really miss in python being able to say `for x in xyz if x:` – bgusach Sep 10 '14 at 08:06
  • @ikaros45 for x in filter(None, xyz): (of course, you can rebind partial(filter, None) if you need it often). – Veky Jul 18 '15 at 07:19
  • @Veky that's fine for basic filtering, but when you do more complex things you have to define lambdas and the beauty is lost. – bgusach Jul 18 '15 at 09:23
  • 1
    You mean condition like "if x**2 < 5"? Yes, that might be ugly to write in one line, but what exactly is the problem with two lines? [Comprehensions/genexps have to be inline since they are _expressions_. When you're writing _statements_, as you are if you do complicated things in the suite of the loop, it's natural to use lines for them.] If you're afraid of indenting your code too much, you don't have to use 4 spaces every time. `for x in xyz:/_if x**2 < 5:/____do something` (/ is newline, _ is space). – Veky Jul 18 '15 at 10:17
  • @bgusach You can always do `for x in filter(None, xyz):`. – Rob Oct 07 '15 at 09:06
  • how should i turn it into a one-liner when having a break in my loop. for example ```result=[]; for x in [1,2,3,4,5]: if x>3: result.append(x); break;``` – Diansheng Aug 24 '17 at 07:17
  • @bgusach I miss those kinds of statements from other languages: don't know when it were removed from python. I disagree with whatever mindset the python controllers (*Guido* .. ) have on most of these items - (mostly to keep things simple .. at the expense of capabilities) – WestCoastProjects Mar 04 '18 at 21:30
  • What about `for x in xyz or ():`? – thomas.mc.work Jun 14 '18 at 11:26
  • 29
    `for x in (x for x in xyz if x not in a):` works for me, but why you shouldn't just be able to do `for x in xyz if x not in a:`, I'm not sure... – Matti Wens Sep 13 '18 at 10:20
  • Great answer thanks. But just wondering why Python is making so ugly syntax. –  Jun 09 '19 at 01:03
  • 1
    Is this supposed to work if `a` is dynamically updated within the cycle? – Dr_Zaszuś Jun 19 '19 at 19:04
  • Why is it ugly? To me by starkly declaring a `for` keyword after a variable it makes it clear early on in reading that this is potentially going to be a reasonably complicated expression for the thing being assigned (or looped through). It's possible to have a few small expressions here for each of `x`, `xyz`, `a`. The first `x` could even be used to transform each result (e.g. `x.lower() for x in xyz if x not in a`). By separating it out in this way it makes the overall context of creating an iterator clearer. – James Bedford Jul 06 '20 at 14:04
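
On the question above about updating the ignore collection during the loop: a generator expression evaluates its `if` condition lazily, one item at a time, so changes made inside the loop body are visible to later checks. A small sketch, assuming a hypothetical `seen` set that grows while iterating (`seen` and `firsts` are illustrative names):

```python
xyz = [1, 2, 1, 3, 2, 4]
seen = set()  # hypothetical ignore set that grows during the loop

# The condition `x not in seen` is re-evaluated for each item when it is requested.
gen = (x for x in xyz if x not in seen)

firsts = []
for x in gen:
    firsts.append(x)
    seen.add(x)  # mutating the set affects the checks for later items

print(firsts)  # [1, 2, 3, 4]
```

This relies on mutating the same set object; rebinding `seen` to a new object is a different case and is not covered by this sketch.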
50

As per The Zen of Python (if you are wondering whether your code is "Pythonic", that's the place to go):

  • Beautiful is better than ugly.
  • Explicit is better than implicit.
  • Simple is better than complex.
  • Flat is better than nested.
  • Readability counts.

The Pythonic way of getting the sorted intersection of two sets is:

>>> sorted(set(a).intersection(xyz))
[0, 4, 6, 7, 9]

Or those elements that are in xyz but not in a:

>>> sorted(set(xyz).difference(a))
[12, 242]

But for a more complicated loop you may want to flatten it by iterating over a well-named generator expression and/or calling out to a well-named function. Trying to fit everything on one line is rarely "Pythonic".
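
One way this flattening might look, as a rough sketch: the filtering lives in a named generator expression and the work lives in a named function. The helper `describe` and the names `known`, `unknown_numbers` and `descriptions` are made up for illustration.

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]


def describe(number):
    """Hypothetical helper standing in for a more complicated loop body."""
    parity = "even" if number % 2 == 0 else "odd"
    return "{0} is {1} and not in a".format(number, parity)


known = set(a)  # set membership tests avoid rescanning the list each time
unknown_numbers = (x for x in xyz if x not in known)

descriptions = [describe(x) for x in unknown_numbers]
print(descriptions)
```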


Update following additional comments on your question and the accepted answer

I'm not sure what you are trying to do with enumerate, but if a is a dictionary, you probably want to use the keys, like this:

>>> a = {
...     2: 'Turtle Doves',
...     3: 'French Hens',
...     4: 'Colly Birds',
...     5: 'Gold Rings',
...     6: 'Geese-a-Laying',
...     7: 'Swans-a-Swimming',
...     8: 'Maids-a-Milking',
...     9: 'Ladies Dancing',
...     0: 'Camel Books',
... }
>>>
>>> xyz = [0, 12, 4, 6, 242, 7, 9]
>>>
>>> known_things = sorted(set(a.iterkeys()).intersection(xyz))
>>> unknown_things = sorted(set(xyz).difference(a.iterkeys()))
>>>
>>> for thing in known_things:
...     print 'I know about', a[thing]
...
I know about Camel Books
I know about Colly Birds
I know about Geese-a-Laying
I know about Swans-a-Swimming
I know about Ladies Dancing
>>> print '...but...'
...but...
>>>
>>> for thing in unknown_things:
...     print "I don't know what happened on the {0}th day of Christmas".format(thing)
...
I don't know what happened on the 12th day of Christmas
I don't know what happened on the 242th day of Christmas
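
The session above is Python 2 (`iterkeys()` and the `print` statement). A possible Python 3 equivalent, offered as a sketch rather than as part of the original answer: iterating a dict yields its keys, so the dict can be passed to the set methods directly.

```python
a = {
    2: 'Turtle Doves',
    3: 'French Hens',
    4: 'Colly Birds',
    5: 'Gold Rings',
    6: 'Geese-a-Laying',
    7: 'Swans-a-Swimming',
    8: 'Maids-a-Milking',
    9: 'Ladies Dancing',
    0: 'Camel Books',
}
xyz = [0, 12, 4, 6, 242, 7, 9]

# set(a) is the set of keys; difference() accepts any iterable, including a dict.
known_things = sorted(set(a).intersection(xyz))
unknown_things = sorted(set(xyz).difference(a))

for thing in known_things:
    print('I know about', a[thing])

print('...but...')

for thing in unknown_things:
    print("I don't know what happened on the {0}th day of Christmas".format(thing))
```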
johnsyweb
  • Sounds like from the comments below, I should be studying up on generators. I've never used them. Thanks. Is a generator faster than the equivalent combination of FOR and IF statements? I've also used sets, but sometimes redundant elements in a list are information I can't discard. – ChewyChunks Aug 08 '11 at 12:18
  • @ChewyChunks: Generators are not the only way to be Pythonic! – johnsyweb Aug 08 '11 at 12:34
  • 5
    @Johnsyweb, if you're going to quote the Zen of Python: "There should be one-- and preferably only one --obvious way to do it." – Wooble Aug 08 '11 at 15:11
  • @Wooble: There should. I quoted that section in [my answer to another question](http://stackoverflow.com/questions/6981495/how-can-i-concatenate-a-string-and-a-number-in-python/6981521#6981521) around the same time! – johnsyweb Aug 08 '11 at 22:01
  • the python language fails on three counts of the zen of python: and I disagree with the other three (explicit, simple, flat). I'm no newbie: it has been my primary language for 30 months and I did major project[s] using it every year since 2012 . Is this comment off topic? Given the zen was put in relief in the question not necessarily – WestCoastProjects Oct 31 '21 at 19:31
  • @Wooble Except if you're Dutch – acidjunk Apr 22 '22 at 00:47
39

The following is a simplification/one liner from the accepted answer:

a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]

for x in (x for x in xyz if x not in a):
    print(x)

12
242

Notice that the generator is kept inline. This was tested on Python 2.7 and Python 3.6 (note the parentheses in the print call).

It is honestly cumbersome even so: the x is mentioned four times.

WestCoastProjects
25

I personally think this is the prettiest version:

a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
for x in filter(lambda w: w in a, xyz):
    print(x)

Edit

If you are keen to avoid lambda, you can use partial function application together with the operator module (which provides function versions of most operators).

https://docs.python.org/2/library/operator.html#module-operator

from operator import contains
from functools import partial
print(list(filter(partial(contains, a), xyz)))
Alexander Oh
  • 9
    `filter(a.__contains__, xyz)`. Usually when people use lambda, they really need something much simpler. – Veky Jul 18 '15 at 07:20
  • I think you misunderstood something. `__contains__` is a method like any other, only it is a _special_ method, meaning it can be called indirectly by an operator (`in` in this case). But it can also be called directly, it is a part of the public API. Private names are specifically defined as having at most one trailing underscore, to provide exception for special method names - and they are subject to name mangling when lexically in class scopes. See https://docs.python.org/3/reference/datamodel.html#specialnames and https://docs.python.org/3.6/tutorial/classes.html#private-variables . – Veky Jan 06 '16 at 13:29
  • It is certainly ok, but two imports just to be able to refer to a method that's accessible using just an attribute seems weird (operators are usually used when double dispatch is essential, but `in` is singly dispatched wrt right operand). Besides, note that `operator` also exports `contains` method under the name `__contains__`, so it surely is not a private name. I think you'll just have to learn to live with the fact that not every double underscore means "keep away". :-] – Veky Jan 08 '16 at 14:14
  • I think your `lambda` needs fixing to include `not` : `lambda w: not w in a, xyz` – WestCoastProjects Aug 08 '19 at 16:09
  • 1
    The filter seems more elegant, especially for complex conditions that would become defined functions instead of lambdas, maybe naming the lambda function would add some readability, The generator seems better when the iterated elements are some modification on the list items – Khanis Rok Sep 23 '19 at 21:33
  • better solution to lambda or `partial(contains, a)` just define a one line function above: `def isKnown(x): return x in a;` This makes the loop read beautifully: `for x in filter(isKnown, xyz): pass` – Tadhg McDonald-Jensen Feb 29 '20 at 23:01
  • 1
    @TadhgMcDonald-Jensen split your code into functions however you like, this is just example code. Bear in mind that it depends on the complexity of the passed function, whether it's worth introducing an extra symbol vs a provided one. This depends on whether you have for instance documentation guarantees that you need, and unit test coverage. – Alexander Oh Mar 02 '20 at 12:01
18

I would probably use:

for x in xyz: 
    if x not in a:
        print(x...)
Wim Feijen
  • @KirillTitov Yes python is a fundamentally non-functional language (this is a purely imperative coding - and I agree with this answer's author that it is the way python is set up to be written. Attempting to use functionals leads to poorly reading or non-`pythonic` results. I can code functionally in every other language I use (scala, kotlin, javascript, R, swift, ..) but difficult/awkward in python – WestCoastProjects May 30 '20 at 16:20
9
>>> a = [2,3,4,5,6,7,8,9,0]
>>> xyz = [0,12,4,6,242,7,9]
>>> set(a) & set(xyz)
set([0, 9, 4, 6, 7])
sloth
Kracekumar
  • Very Zen, @lazyr, but would not help me improve a complex code block that depends on iterating through one list and ignoring matching elements in another list. Is it faster to treat the first list as a set and compare union / difference with a second, growing "ignore" list? – ChewyChunks Aug 08 '11 at 12:22
  • Try this `import time a = [2,3,4,5,6,7,8,9,0] xyz = [0,12,4,6,242,7,9] start = time.time() print (set(a) & set(xyz)) print time.time() - start` – Kracekumar Aug 08 '11 at 12:31
  • @ChewyChunks if either of the lists change during the iteration it will probably be faster to check each element against the ignore list -- except you should make it an ignore set. Checking for membership in sets is very fast: `if x in ignore: ...`. – Lauritz V. Thaulow Aug 08 '11 at 12:42
  • @lazyr I just rewrote my code using an **ignore set** over an ignore list. Appears to process time much slower. (To be fair I was comparing using `if set(a) - set(ignore) == set([]):` so perhaps that's why it was much slower than checking membership. I'll test this again in the future on a much simpler example than what I'm writing. – ChewyChunks Aug 08 '11 at 15:29
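
To compare the two membership tests discussed in these comments, something like the following could be used. It is only a sketch: the sizes, the `probe` value and the variable names are arbitrary assumptions, and absolute timings depend on the machine.

```python
import timeit

ignore_list = list(range(10000))  # hypothetical ignore list
ignore_set = set(ignore_list)     # same values, stored as a set
probe = 9999                      # last element: worst case for the list scan

# Time only the membership test itself, not the construction of the containers.
list_time = timeit.timeit(lambda: probe in ignore_list, number=200)
set_time = timeit.timeit(lambda: probe in ignore_set, number=200)

print("list membership: {0:.6f}s".format(list_time))
print("set membership:  {0:.6f}s".format(set_time))
```

Note that an expression such as `set(a) - set(ignore) == set([])` builds two new sets on every evaluation, which is a different (and costlier) operation than a single `x in ignore` check against an existing set.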
7

I liked Alex's answer, because a filter is exactly an if applied to a list. If you want to iterate over the subset of a list that satisfies a condition, this seems to be the most natural way:

mylist = [1,2,3,4,5]
another_list = [2,3,4]

wanted = lambda x:x in another_list

for x in filter(wanted, mylist):
    print(x)

This method is useful for separation of concerns: if the condition changes, the only code to touch is the predicate function itself.

mylist = [1,2,3,4,5]

wanted = lambda x:(x**0.5) > 10**0.3

for x in filter(wanted, mylist):
    print(x)

The generator approach seems better when you want not the members of the list themselves but a transformation of them:

mylist = [1,2,3,4,5]

wanted = lambda x:(x**0.5) > 10**0.3

generator = (x**0.5 for x in mylist if wanted(x))

for x in generator:
    print(x)

Filters also work on generators, although in this case it isn't efficient, because every element is transformed before it is filtered:

mylist = [1,2,3,4,5]

wanted = lambda x:(x**0.5) > 10**0.3

generator = (x**0.9 for x in mylist)

for x in filter(wanted, generator):
    print(x)

But of course, it would still be nice to write like this:

mylist = [1,2,3,4,5]

wanted = lambda x:(x**0.5) > 10**0.3

# for x in filter(wanted, mylist):
for x in mylist if wanted(x):
    print(x)
Khanis Rok
6

You can use generators too, if generator expressions become too involved or complex:

def gen():
    for x in xyz:
        if x in a:
            yield x

for x in gen():
    print(x)
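
A generator function can also take the two collections as parameters instead of reading them from the enclosing scope. A minimal sketch, where `matching` is a hypothetical name and the elements are assumed to be hashable:

```python
def matching(items, allowed):
    """Yield each element of `items` that also appears in `allowed`."""
    allowed = set(allowed)  # assumption: elements are hashable
    for x in items:
        if x in allowed:
            yield x


a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

result = list(matching(xyz, a))
print(result)  # [0, 4, 6, 7, 9]
```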
Lauritz V. Thaulow
  • This is a bit more useful to me. I've never looked at generators. They sound scary (because I saw them in modules that were generally a pain to use). – ChewyChunks Aug 08 '11 at 12:10
2

Use intersection or intersection_update

  • intersection :

    a = [2,3,4,5,6,7,8,9,0]
    xyz = [0,12,4,6,242,7,9]
    ans = sorted(set(a).intersection(set(xyz)))
    
  • intersection_update:

    a = [2,3,4,5,6,7,8,9,0]
    xyz = [0,12,4,6,242,7,9]
    b = set(a)
    b.intersection_update(xyz)
    

    then b is your answer
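
The difference between the two methods, shown side by side as a short sketch (the names `common` and `returned` are illustrative): `intersection` returns a new set, while `intersection_update` modifies the set in place and returns `None`.

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

# intersection: builds and returns a new set; any iterable is accepted.
common = set(a).intersection(xyz)

# intersection_update: shrinks b in place and returns None.
b = set(a)
returned = b.intersection_update(xyz)

print(sorted(common))  # [0, 4, 6, 7, 9]
print(sorted(b))       # [0, 4, 6, 7, 9]
print(returned)        # None
```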

Chung-Yen Hung
2

Based on the article here: https://towardsdatascience.com/a-comprehensive-hands-on-guide-to-transfer-learning-with-real-world-applications-in-deep-learning-212bf3b2f27a I used the following code for the same purpose and it worked just fine:

an_array = [x for x in xyz if x not in a]

This line is one part of a larger program: the list xyz and the variable a must already be defined and assigned before it runs.

Using a generator expression (as recommended in the accepted answer) causes some difficulties, because the result is a generator rather than a list.
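
If a list is needed after all, a generator can be converted with `list()`. A small sketch with illustrative names (`as_list`, `leftover`); note that a generator can only be consumed once:

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

gen = (x for x in xyz if x not in a)

as_list = list(gen)   # consumes the generator and collects its items
leftover = list(gen)  # the generator is now exhausted, so this is empty

print(as_list)   # [12, 242]
print(leftover)  # []
```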

1

A simple way to find unique common elements of lists a and b:

a = [1,2,3]
b = [3,6,2]
for both in set(a) & set(b):
    print(both)
peawormsworth
0

Well, it's possible to do it in just one line.

a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
print('\n'.join([str(x) for x in xyz if x not in a]))

This gives you:

12
242