I want to create a dict from 2 columns of a dataframe.

Let's say they look like this:

A         B
car1     brand1
car2     brand2
car3     brand1
car4     brand3
car5     brand2

Desired output:

{'brand1': ['car1', 'car3'], 'brand2': ['car2', 'car5'], 'brand3': 'car4'}

There is a to_dict method, but when I try to use it I can't get it to collect multiple values under one key; it only maps one value to one key.

I know I can loop over column A, look up the matching value in column B with iloc, and use an if/else to either create a new key or append the value to an existing key, but I am looking for a more elegant solution.
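For reference, the loop-based approach described above might look roughly like the sketch below. The DataFrame is rebuilt from the sample data, the name `result` is only illustrative, and `zip` over the two columns stands in for positional `iloc` lookups.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "A": ["car1", "car2", "car3", "car4", "car5"],
    "B": ["brand1", "brand2", "brand1", "brand3", "brand2"],
})

result = {}
# Walk both columns in parallel instead of indexing with iloc.
for car, brand in zip(df["A"], df["B"]):
    if brand in result:
        # Brand already seen: append to its existing list.
        result[brand].append(car)
    else:
        # First occurrence of this brand: start a new list.
        result[brand] = [car]

print(result)
```

This works, but it is the explicit bookkeeping the question is hoping to avoid.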

Alex
  • what's an "elegant solution" when a solution isn't possible? Dicts must have unique keys. Maybe use dicts with lists as value. – Paritosh Singh Feb 05 '20 at 23:16
  • @ParitoshSingh the keys are unique. Keys are brands, and there can be multiple values (cars) per key. When, for example, brand1 appears in column B a second time, we just add car3 to the already existing key. – Alex Feb 05 '20 at 23:21
  • Your output isn't valid python – G. Anderson Feb 05 '20 at 23:22
  • fixed to valid python so it's clear for everyone – Alex Feb 05 '20 at 23:32

1 Answer

Borrowing from "grouping rows in list in pandas groupby": you can group by column B, aggregate column A into a list per group, then call to_dict() on the result:

df.groupby('B')['A'].apply(list).to_dict()
{'brand1': ['car1', 'car3'], 'brand2': ['car2', 'car5'], 'brand3': ['car4']}
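As a self-contained sketch of the same idea, the snippet below rebuilds the sample DataFrame and runs the one-liner. The variable names `brand_to_cars` and `unwrapped` are illustrative only, and the final step assumes you really want single-car brands as a bare string, as in the question's desired output.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "A": ["car1", "car2", "car3", "car4", "car5"],
    "B": ["brand1", "brand2", "brand1", "brand3", "brand2"],
})

# Group by brand, collect each group's cars into a list, convert to a dict.
brand_to_cars = df.groupby("B")["A"].apply(list).to_dict()
print(brand_to_cars)

# Optional (assumption): unwrap one-element lists to match the question's
# output, where 'brand3' maps to 'car4' rather than ['car4'].
unwrapped = {
    brand: cars[0] if len(cars) == 1 else cars
    for brand, cars in brand_to_cars.items()
}
print(unwrapped)
```

Keeping every value as a list, as the one-liner does, is usually easier to work with downstream, since callers never have to check whether a value is a string or a list.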
G. Anderson