7

I'm developing a Python program to detect names of cities in a list of records. The code I've developed so far is the following:

aCities = ['MELBOURNE', 'SYDNEY', 'PERTH', 'DUBAI', 'LONDON']

cxTrx = db.cursor()
cxTrx.execute( 'SELECT desc FROM AccountingRecords' )

for row in cxTrx.fetchall():
    if any(city in row[0] for city in aCities):
        # print the name of the city that fired the any() function
        pass
    else:
        # no city name found in the accounting record
        pass

The code correctly detects when a city from the aCities list appears in the accounting record, but because any() returns only True or False, I can't tell which city (Melbourne, Sydney, Perth, Dubai or London) triggered the match.

I've tried aCities.index and a queue, but with no success so far.

Giacomo1968
Luis U.

5 Answers

10

I don't think it's possible with any. You can use next with a default value:

for row in cxTrx.fetchall():
    # next() yields the first matching city, or None if there is no match
    city = next((city for city in aCities if city in row[0]), None)
    if city is not None:
        # city is the name that any() would have fired on
        print(city)
    else:
        # no city name found in the accounting record
        pass
falsetru
4

You can't, because any returns only a boolean value. But you can use next:

city = next((city for city in aCities if city in row[0]), None)
if city:
   ...

With this syntax you'll find the first city that is a substring of the description stored in the database row. If there isn't one, the second argument (here None) is returned.
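
As a rough, self-contained sketch of the same idea: the descriptions list below is a hypothetical stand-in for the database rows, and first_city is an illustrative helper name, neither comes from the question.

```python
# Hypothetical stand-ins for the rows returned by the database query.
cities = ['MELBOURNE', 'SYDNEY', 'PERTH', 'DUBAI', 'LONDON']
descriptions = [
    'HOTEL SYDNEY CBD 2 NIGHTS',
    'OFFICE SUPPLIES',
    'FLIGHT PERTH TO LONDON',
]


def first_city(description, candidates):
    """Return the first candidate contained in description, or None."""
    # The generator is lazy, so the search stops at the first match.
    return next((c for c in candidates if c in description), None)


matches = [first_city(d, cities) for d in descriptions]
print(matches)  # ['SYDNEY', None, 'PERTH']
```

Note that "first" means first in the order of the cities list, not first in the description: the last record contains both PERTH and LONDON, and PERTH wins because it comes earlier in the list.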

JuniorCompressor
3

No, it is possible with any. It's a bit of a stunt - it "reads funny" - but it does work:

if any(city in row[0] and not print(city) for city in aCities):
    # city in row[0] found, and already printed :)
    # do whatever else you might want to do
    pass
else:
    # no city name found in the accounting record
    pass

or more concisely, if all you really want to do is print the city:

if not any(city in row[0] and not print(city) for city in aCities):
    # no city name found in the accounting record
    pass

It works for three reasons:

  1. any stops at the first true (truthy) item,
  2. and is short-circuiting, so not print(city) will only be evaluated if city in row[0] is true, and
  3. print returns None, so not print(...) is always True.
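
The three points above can be checked with a small sketch. It swaps print for list.append (which also returns None) so the side effect can be inspected afterwards; the description string and the record helper are made up for illustration.

```python
cities = ['MELBOURNE', 'SYDNEY', 'PERTH', 'DUBAI', 'LONDON']
description = 'FLIGHT PERTH TO LONDON'  # hypothetical record text

hits = []     # filled by the side effect, playing the role of print(city)
checked = []  # every city the generator actually evaluated


def record(city):
    """Log that city was tested and report whether it is in description."""
    checked.append(city)
    return city in description


# list.append returns None, so `not hits.append(city)` is always True,
# and it only runs when record(city) is true because `and` short-circuits.
found = any(record(city) and not hits.append(city) for city in cities)

print(found)    # True
print(hits)     # ['PERTH']
print(checked)  # ['MELBOURNE', 'SYDNEY', 'PERTH']
```

DUBAI and LONDON are never tested, which shows any stopping at the first truthy item.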

PS: As @falsetru points out, in Python 2.x print isn't a function, so you'll have to first say:

from __future__ import print_function

As I said, it works for 3 reasons - Python 3 reasons ;) Oh, wait - that's 4 reasons...

BrianO
  • Clever, but `print` in Python 2.x is a statement, not a function. You'd better mention `from __future__ import print_function`. +1 – falsetru Aug 01 '15 at 07:55
2

My answer will work only for Python >= 3.8. You could use the Walrus Operator (:=) to achieve this:

for row in cxTrx.fetchall():
    if any((t_city := city) in row[0] for city in aCities):
        # print the name of the city that fired the any() function
        print(t_city)
    else:
        # no city name found in the accounting record
        pass

Note that the parentheses around t_city := city are important. Without them, := binds more loosely than in, so t_city would be assigned the value of the whole expression city in row[0], which is True on the iteration where any() is triggered.
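
A minimal sketch of the difference, assuming Python 3.8 or later. The description string and the names hit, flag and last are illustrative only.

```python
cities = ['MELBOURNE', 'SYDNEY', 'PERTH', 'DUBAI', 'LONDON']
description = 'FLIGHT PERTH TO LONDON'  # hypothetical record text

# Parenthesised as in the answer: the city name is bound, then tested.
if any((hit := city) in description for city in cities):
    print(hit)  # PERTH

# Parentheses around the whole test: the boolean result is bound instead.
if any((flag := city in description) for city in cities):
    print(flag)  # True

# When nothing matches, the name simply holds the last city that was tested.
any((last := city) in 'OFFICE SUPPLIES' for city in cities)
print(last)  # LONDON
```

The last case is why the bound name should only be read inside the branch where any() returned True.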

The Walrus operator was introduced in Python 3.8: What's new in Python 3.8

1

For completeness, here is a solution with a standard for-loop:

for city in aCities:
    if city in row[0]:
        print('Found city', city)
        break
else:
    # runs only if the loop finished without hitting break
    print('Did not find any city')

This should have the same short-circuit behavior as any, since it breaks out of the for-loop when the condition is fulfilled. The else part is executed when the for-loop runs till the end without breaking, see this question.

Although this solution uses more lines, it actually uses fewer characters than the other solutions: there is no call to next(..., None), no extra city = assignment, and no second if city is None check (at the cost of one extra break). When things get more complicated, it is sometimes clearer to write out the for-loop explicitly than to string together generator expressions and next calls.

Community
Bas Swinckels
  • Thank you @bas-swinckels. The code should process circa 10 million records and the cities list will be circa 500 items. I love your approach in terms of clarity but what do you think about the performance? – Luis U. Aug 01 '15 at 09:37
  • @LuisU. That really depends on your application; the only way to know is to profile your code (e.g. with the `timeit` command in IPython, or with the [timeit](https://docs.python.org/2/library/timeit.html) module). If, instead of simply printing something, you do some heavy computation, all the rest is negligible. I could also imagine that talking to your database will take much more time than doing a simple for-loop over 5 cities. – Bas Swinckels Aug 01 '15 at 09:49
  • Please [don't start optimizing](http://c2.com/cgi/wiki?PrematureOptimization) before you know that there is a problem; try to write clear and maintainable code first. But if you do want to optimize, start by optimizing the algorithm, e.g. you might replace a membership test such as `city in list_of_cities` (time linear in the length of the list) with `city in set_of_cities` (constant time on average). Only if it is still too slow after that, start profiling all lines, and start worrying about the speed of Python loops vs built-in functions like `any`. – Bas Swinckels Aug 01 '15 at 09:50
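
To follow the profiling advice in the comments, here is a minimal timeit sketch. The 500 synthetic city names and the single description are made-up test data, and with_next / with_loop are illustrative names; real timings will depend on your records and machine.

```python
import timeit

# Hypothetical data: 500 city names, with the match placed last (worst case).
cities = ['CITY%03d' % i for i in range(500)]
description = 'PAYMENT REF 123 CITY499 BRANCH'


def with_next():
    """Generator expression plus next(), as in the accepted-style answers."""
    return next((c for c in cities if c in description), None)


def with_loop():
    """Explicit for-loop, returning as soon as a city matches."""
    for c in cities:
        if c in description:
            return c
    return None


for fn in (with_next, with_loop):
    seconds = timeit.timeit(fn, number=2000)
    print(fn.__name__, round(seconds, 4))
```

Both functions return the same city, so the comparison is only about speed. For a real decision, it would be more meaningful to time the whole loop over actual database rows.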