Create dictionary from multiple rows in dataframe

Question

I have a dataframe like so:

I would like to create a dictionary that looks like this:

dict = {'car' : ['mazda', 'toyota', 'ford'],
        'bike' : ['honda', 'kawasaki', 'suzuki']
       }

I have tried a number of answers found on Stack Overflow, including `dict(df.values)`, which I found at "Convert a Pandas DataFrame to a dictionary", but this gave me the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [38], line 1
----> 1 dict(df.values)

TypeError: 'dict' object is not callable

This is part of an assignment. The instructor left a hint suggesting they expect an `x for x in df`-type solution.

Any help would be appreciated.

Don't use `dict` as a variable name; you're shadowing the built-in class, and that's why you get the error. – fsimonjetz, Nov 16 '22 at 22:08
Thanks @fsimonjetz, that was right, I didn't even think of that. I removed it and now `dict(df.values)` does not produce an error. However, the dictionary it produces only gives me this: `{'car': 'mazda', 'bike': 'kawasaki'}` – opperman.eric, Nov 16 '22 at 22:17
Since this is an assignment I don't wanna give you the solution, but you know from the hint that it has to be a comprehension, a dictionary comprehension in particular. You might wanna look into what you've learned so far; perhaps you'll find something that puts rows with the same value in one column together into groups. – fsimonjetz, Nov 16 '22 at 22:19
@fsimonjetz, everything I have for dictionary comprehension deals with items that are already in a dictionary, for instance if the above had 1 row of car in the item column and then a dict of ford, mazda and volkswagen in the name column. I am struggling to apply this to my dataframe example. – opperman.eric, Nov 17 '22 at 00:39
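
The two symptoms discussed in these comments can be reproduced without pandas. The sketch below is only an illustration: `rows` is a made-up stand-in for `df.values` (the actual dataframe is not shown in the question), and `shadowed_call` is a hypothetical helper name.

```python
# Hypothetical rows standing in for df.values; the real data is not shown.
rows = [["car", "toyota"], ["bike", "honda"], ["car", "mazda"], ["bike", "kawasaki"]]


def shadowed_call(pairs):
    # Binding the name `dict` to an instance hides the built-in class
    # inside this function, just as `dict = {...}` did in the notebook.
    dict = {"car": ["mazda"]}
    try:
        return dict(pairs)
    except TypeError as exc:
        return str(exc)


error_message = shadowed_call(rows)
print(error_message)  # 'dict' object is not callable

# Outside the function the built-in is untouched. dict() keeps one value
# per key, so each later row overwrites the earlier one with the same key.
flat = dict(rows)
print(flat)  # {'car': 'mazda', 'bike': 'kawasaki'}
```

In a notebook, running `del dict` (or restarting the kernel) removes the shadowing name. The second result shows why a plain `dict(...)` call cannot collect several names per item.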

score 0 · Answer 1 · answered Nov 17 '22 at 02:25

This is a potential solution to the above:

df.groupby(['item'])['name'].apply(lambda grp: list(grp.value_counts().index)).to_dict()
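
One thing worth knowing about this one-liner is that `value_counts()` collapses repeated names, so each name appears once per item. A sketch of the difference from a plain `apply(list)`, assuming the columns are called `item` and `name` as in the line above; the rows are invented for illustration and include a deliberate duplicate:

```python
import pandas as pd

# Hypothetical data: column names follow the answer, the rows are made up,
# and "toyota" is repeated on purpose.
df = pd.DataFrame({
    "item": ["car", "bike", "car", "bike", "car"],
    "name": ["toyota", "honda", "ford", "suzuki", "toyota"],
})

names = df.groupby("item")["name"]

# Every row is kept, in dataframe order within each group.
all_names = names.apply(list).to_dict()

# The answer's approach: value_counts() counts each distinct name,
# so its index holds each name only once.
distinct_names = names.apply(lambda grp: list(grp.value_counts().index)).to_dict()

print(all_names)
print(distinct_names)
```

If the data has no repeated names, both variants contain the same names per item, although the order within each list may differ.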

opperman.eric


`groupby` is certainly the right direction. I think the key is (and it might not be so obvious) that `groupby` objects are iterable. Try `for elem in df.groupby("item")["name"]: print(elem)`. It gives you tuples of a string (e.g. "car") and a series. Series have a `.to_list()` method, too. – fsimonjetz Nov 17 '22 at 09:10
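
Following that hint, a dictionary comprehension over the grouped column might look like the sketch below. The dataframe is an assumption: the column names `item` and `name` come from the answer, while the row order is invented because the original table is not shown.

```python
import pandas as pd

# Hypothetical reconstruction of the dataframe; the row order is assumed.
df = pd.DataFrame({
    "item": ["car", "bike", "car", "bike", "car", "bike"],
    "name": ["mazda", "honda", "toyota", "kawasaki", "ford", "suzuki"],
})

# Iterating over the grouped column yields (key, Series) pairs,
# so each Series can be turned into a list inside the comprehension.
result = {item: names.to_list() for item, names in df.groupby("item")["name"]}

print(result)
```

By default `groupby` sorts the group keys, so `bike` comes before `car` in the resulting dictionary; the lists keep the row order of the dataframe.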
