
I'm running a DB query such as `select fruits from warehouse`. The goal is to build a dictionary that maps each warehouse to the fruits it holds:

{13: [apple, orange, grapes], 14: [banana, pineapple], 20: [strawberry, avocado, blueberry]}

I want to remove any fruit that is present in several warehouses altogether and print an error message about it. I have a solution that works, but it requires several steps:

fruits = set()
duplicates = set()
stock = {}
tmp = []

for warehouse in warehouses:
  # results holds the rows returned by the query on this warehouse
  for row in results:
    if row[0] in fruits:
      print(row[0] + " is a duplicate")
      duplicates.add(row[0])
    else:
      fruits.add(row[0])
      tmp.append(row[0])
  stock[warehouse] = tmp
  tmp = []

final_result = {}

# Remove duplicates (items() works in both Python 2 and 3)
for warehouse, fruits in stock.items():
  final_result[warehouse] = []
  for fruit in fruits:
    if fruit not in duplicates:
      final_result[warehouse].append(fruit)

I like to use dict/list comprehensions where I can, but that seems ruled out here, and the whole approach looks a bit cumbersome. Is there a better/cleaner way to achieve the same result?

Lunacy
  • What is `warehouses`, and what is `results`? Please see how to produce a [mre] and clarify the inputs and expected outputs... – r.ook Dec 23 '19 at 15:50
  • Each 'warehouse' is a different db host on which I execute the query, the result of that query is in results. – Lunacy Dec 23 '19 at 20:40
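
To make the inputs concrete, here is a minimal, runnable sketch of the approach described in the question. It assumes that each warehouse's query returns a list of one-column rows; the `query_results` dictionary and its sample data are hypothetical stand-ins for the real per-host DB calls, and `seen` is just an illustrative name for the set of fruits encountered so far.

```python
# Hypothetical stand-in for the per-host queries: warehouse id -> rows,
# where each row is a 1-tuple holding the fruit name (row[0] in the question).
query_results = {
    13: [("apple",), ("orange",), ("grapes",), ("banana",)],
    14: [("banana",), ("pineapple",)],
    20: [("strawberry",), ("avocado",), ("apple",)],
}

seen = set()        # every fruit encountered so far, across all warehouses
duplicates = set()  # fruits encountered more than once
stock = {}

for warehouse, rows in query_results.items():
    stock[warehouse] = []
    for row in rows:
        fruit = row[0]
        if fruit in seen:
            duplicates.add(fruit)
        else:
            seen.add(fruit)
            stock[warehouse].append(fruit)

for fruit in sorted(duplicates):
    print(fruit + " is a duplicate")

# Drop every duplicated fruit, including its first occurrence.
final_result = {
    warehouse: [fruit for fruit in fruits if fruit not in duplicates]
    for warehouse, fruits in stock.items()
}
print(final_result)
```

With this sample data, `apple` and `banana` are reported as duplicates and removed from every warehouse, including the one where they were first seen.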

1 Answer


Yes, there is, and in fact it can be done with comprehensions:

stock = {}

for warehouse in warehouses:
    stock[warehouse] = []
    for row in results:
        stock[warehouse].append(row[0])

fruit_list = [fruit for warehouse in stock.values() for fruit in warehouse]

duplicates = {fruit for fruit in set(fruit_list) if fruit_list.count(fruit) > 1}

for fruit in duplicates:
    print(fruit + " is a duplicate")

final_result = {
    warehouse: [fruit for fruit in fruits if fruit not in duplicates]
    for warehouse, fruits in stock.items()
}
DarklingArcher
  • Thanks, that looks really clean. However, the set comprehension for generating duplicates seems quite inefficient, as it takes a lot longer to execute than what I had originally (I have about 50000 fruits). – Lunacy Dec 23 '19 at 20:50
  • I agree that your original code is rather efficient at what it does. The only thing I don't get is why you use the `tmp` variable instead of appending directly to `stock[warehouse]`. – DarklingArcher Dec 23 '19 at 20:54
  • I just didn't think of it :) So your point of view definitely helped in cleaning up my code! I found a way to optimize your proposal: `{fruit for fruit, number in Counter(fruit_list).items() if number > 1}` (https://stackoverflow.com/questions/52072381/how-to-print-only-the-duplicate-elements-in-python-list) – Lunacy Dec 23 '19 at 21:00
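
For reference, the `Counter` variant from the last comment can be sketched end to end as below. The slowdown comes from `fruit_list.count(fruit)` scanning the whole list once per distinct fruit, which is roughly quadratic in the number of fruits, whereas `Counter` tallies everything in a single pass. The `stock` contents here are hypothetical sample data, not the real query output.

```python
from collections import Counter

# Hypothetical sample data: warehouse id -> fruits returned by the query.
stock = {
    13: ["apple", "orange", "grapes", "banana"],
    14: ["banana", "pineapple"],
    20: ["strawberry", "avocado", "apple"],
}

# Tally every fruit across all warehouses in one pass,
# instead of calling list.count() once per distinct fruit.
counts = Counter(fruit for fruits in stock.values() for fruit in fruits)
duplicates = {fruit for fruit, number in counts.items() if number > 1}

for fruit in sorted(duplicates):
    print(fruit + " is a duplicate")

final_result = {
    warehouse: [fruit for fruit in fruits if fruit not in duplicates]
    for warehouse, fruits in stock.items()
}
print(final_result)
```

Note that, as with the code in the question, a fruit listed twice within the same warehouse would also count as a duplicate here.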