I am trying to create a mapping from each list element value to the rows that contain it. For example, given a pandas DataFrame like this:
>>> book_df
   name               genres
0  Harry Potter       ["fantasy", "young adult"]
1  Lord of the Rings  ["fantasy", "adventure", "classics"]
2  I, Robot           ["science fiction", "classics"]
3  Animal Farm        ["classics", "fantasy"]
4  A Monster Calls    ["fantasy", "young adult"]
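For anyone who wants to reproduce this, the frame can be built roughly as follows. This is a minimal sketch: I'm assuming the list column is called `genres` (the name my code further down uses) and that each cell holds a real Python list rather than a string.

```python
import pandas as pd

# Each "genres" cell is an actual Python list, not a string representation.
book_df = pd.DataFrame({
    "name": [
        "Harry Potter",
        "Lord of the Rings",
        "I, Robot",
        "Animal Farm",
        "A Monster Calls",
    ],
    "genres": [
        ["fantasy", "young adult"],
        ["fantasy", "adventure", "classics"],
        ["science fiction", "classics"],
        ["classics", "fantasy"],
        ["fantasy", "young adult"],
    ],
})
```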
I want to generate a dict which maps each genre to the list of books under that genre.
So, what I'm trying to get is something like this:
>>> genre_to_book_map
{
"fantasy": ["Harry Potter", "Lord of the Rings", "Animal Farm", "A Monster Calls"],
"young adult": ["Harry Potter", "A Monster Calls"],
"classics": ["Lord of the Rings", "I, Robot", "Animal Farm"],
"science fiction": ["I, Robot"],
"adventure": ["Lord of the Rings"]
}
I've managed to do this in a rather long-winded way by exploding the lists and then building a dictionary from the result (based on "Pandas column of lists, create a row for each list element" and "Pandas groupby two columns then get dict for values"), like so:
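import numpy as np
import pandas as pd
# Repeat each name once per genre and pair it with the flattened genres.
exploded_genres = pd.DataFrame({
    "name": np.repeat(book_df["name"].values, book_df["genres"].str.len())
}).assign(genres=np.concatenate(book_df["genres"].values))
# Collect names per genre; to_dict() turns the resulting Series into a dict.
genre_to_book_map = exploded_genres.groupby("genres")["name"].apply(list).to_dict()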
However, I'd like to know whether there is a more efficient way of doing this, as it seems like a relatively simple thing to do.
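One shorter candidate I have come across but not benchmarked is `DataFrame.explode`, which I understand was added in pandas 0.25. A minimal sketch, assuming that version or later and the same `name`/`genres` columns (the small frame here is a cut-down stand-in for `book_df`):

```python
import pandas as pd

# Cut-down stand-in for book_df, for illustration only.
book_df = pd.DataFrame({
    "name": ["Harry Potter", "I, Robot", "Animal Farm"],
    "genres": [
        ["fantasy", "young adult"],
        ["science fiction", "classics"],
        ["classics", "fantasy"],
    ],
})

# explode() emits one row per list element, repeating the other columns.
exploded = book_df.explode("genres")

# Group the repeated names by genre and collect them into plain lists.
genre_to_book_map = exploded.groupby("genres")["name"].agg(list).to_dict()
```

Whether this is actually faster than the `np.repeat` version is something I have not measured.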