-3

I have thousands of products with ingredients of each for example:

ProductID  | Ingredients
 00001     |  itemA, itemB, itemC, itemD
 00002     |  itemF, itemD, itemG, itemA, itemI
 00003     |  itemH, itemI, itemD, itemF, itemT,itemB, itemC

........ and so on.

I want to make a unique list of ingredients and make a map that what ingredients are in which product. So For example I want the resulting output in the following way:

{itemA: [00001,00011, 00005,00007]}
{itemB: [00003, 00002, 000056]}
{itemC: [00009, 00087, 00044, 00647, 00031, 00025]}

So the list size will be different for each item. Can somebody help me out in solving this problem? Thanks

muazfaiz
  • 4,611
  • 14
  • 50
  • 88

1 Answers1

1

Assuming its a text file, it could be something like this:

from collections import defaultdict

product_ingredients_mapping = defaultdict(list)
file_data = open('products.txt')

for row in file_data.readlines():
    data = row.split('|')
    ingredients = data[1].split(',')
    product_id = data[0].strip()
    for ingredient in ingredients:
       product_ingredients_mapping[ingredient.strip()].append(product_id)
Sirius
  • 736
  • 2
  • 9
  • 22