1

I have 2 related lists of dictionaries, items and bookings. I need to determine which item has the least bookings.

The real example is in a database, but for the sake of the excercise, consider this data:

from datetime import datetime

item1 = {'foo1'}
item2 = {'foo2'}
items = [item1, item2]

booking1 = {'Start':datetime(2012,1,1), 'Item':'foo1'}
booking2 = {'Start':datetime(2012,1,2), 'Item':'foo1'}
booking3 = {'Start':datetime(2012,1,1), 'Item':'foo2'}
bookings = [booking1, booking2, booking3]

How can I efficiently determine which item has the fewest bookings? Any help would be greatly appreciated!

MFB
  • 19,017
  • 27
  • 72
  • 118
  • 2
    Is it an sql database? If so, doing the filtering with a select count distinct query will be much faster. If you have to pull it all into Python, it will be O(n) because you will *have* to iterate over the entire list, in addition to the inefficiency of selecting more data than you need from sql to python. – rob05c Mar 07 '12 at 02:11
  • Its NoSQL but there is a `distinct` equivalent. Can you explain more about which distinct value I should query? Sorry, I'm not following you yet. – MFB Mar 07 '12 at 02:14
  • A) If it's in a database, do the work in the database... Databases are very very good at set based problems and there are a bunch of similar questions you might ask about that data that would translate easily into SQL. B) That is a terrible data structure for your data. Can a booking have no more than one item? Why isn't Booking a class? If this is database data, are you not using an ORM? – gfortune Mar 07 '12 at 02:18
  • @MFB in SQL, Select Distinct returns the unique values of a _column_. The first answer in [this thread](http://stackoverflow.com/questions/1346345/mysql-count-occurances-of-distinct-values) has an example of how to get the number of occurrences of each value in a column. I recommend Googling how to do the same in your NoSql database. – rob05c Mar 07 '12 at 02:31
  • @gfortune A) agreed B) I didn't design the data structure – MFB Mar 07 '12 at 02:36

2 Answers2

4
from collections import Counter

# create the counter, from most common to least common. reverse it, and get the first item.
item, num = Counter(b['Item'] for b in bookings).most_common()[::-1][0]

More efficient (courtesy of senderle):

from collections import Counter

c = Counter(b['Item'] for b in bookings)
item = min(c, key=c.get)
Sam Dolan
  • 31,966
  • 10
  • 88
  • 84
  • Under most circumstances, this is probably OK -- but `item = min(c, key=c.get)` will be slightly more efficient (O(n)), since `most_common` performs a sort (O(n log n)). – senderle Mar 07 '12 at 03:30
1

You can do this easily, although not particularly efficiently, with collections.Counter (Python's multiset):

import collections
c = collections.Counter()

for booking in bookings:
    c[booking['Item']] += 1

c.most_common()[:-2:-1]
[('foo2', 1)]
Eduardo Ivanec
  • 11,668
  • 2
  • 39
  • 42