This question is related to posts on creating a new column by mapping/lookup using a dictionary. (Adding a new pandas column with mapped value from a dictionary and pandas - add new column to dataframe from dictionary). However, what if I want to create, multiple new columns with dictionary values.
For argument's sake, let's say I have the following df:
country
0 bolivia
1 canada
2 ghana
And in a different dataframe, I have country mappings:
country country_id category color
0 canada 11 north red
1 bolivia 12 central blue
2 ghana 13 south green
I've been using pd.merge
to merge the the mapping dataframe to my df using country and another index as keys it basically does the job, which gives me my desired output:
country country_id category color
0 bolivia 12 central blue
1 canada 11 north red
2 ghana 13 south green
But, lately, I've been wanting to experiment with using dictionaries. I suppose a related question is how does one determine to use pd.merge
or dictionaries to accomplish my task.
For one-off columns that I'll map, I'll create a new column by mapping to a dictionary:
country_dict = dict(zip(country, country_id))
df['country_id'] = df['country'].map(entity_dict)
It seems impractical to define a function that takes in different dictionaries and to create each new column separately (e.g., dict(zip(key, value1)), dict(zip(key, value2))
). I'm stuck on how to proceed in creating multiple columns at the same time. I started over, and tried creating the country mapping excel worksheet as a dictionary:
entity_dict = entity.set_index('country').T.to_dict('list')
and then from there, converting the dict values to columns:
entity_mapping = pd.DataFrame.from_dict(entity_dict, orient = 'index')
entity_mapping.columns = ['col1', 'col2', 'col3']
And I've been stuck going around in circles for the past few days. Any help/feedback would be appreciated!