23

I want to map the values of a dict into one column of a DataFrame, on the rows where the dict key equals the value in a second column of that DataFrame.

For example:

If my dict is:

dict = {'abc':'1/2/2003', 'def':'1/5/2017', 'ghi':'4/10/2013'}

and my DataFrame is:

      Member    Group      Date
 0     xyz       A         np.Nan
 1     uvw       B         np.Nan
 2     abc       A         np.Nan
 3     def       B         np.Nan
 4     ghi       B         np.Nan

I want to get the following:

      Member    Group      Date
 0     xyz       A         np.Nan
 1     uvw       B         np.Nan
 2     abc       A         1/2/2003
 3     def       B         1/5/2017
 4     ghi       B         4/10/2013

Note: The dict doesn't have all the values under "Member" in the df. I don't want those values to be converted to np.Nan if I map. So I think I have to do a fillna(df['Member']) to keep them?


This is unlike "Remap values in pandas column with a dict, preserve NaNs", which replaces the values of a column with the dict values whose keys match them. This question is about adding the dict value to ANOTHER column in a DataFrame based on the key value.

Henry Ecker
Windstorm1981
  • 4
    Simply `df['Date'] = df.Member.map(d)`. Note that you shouldn't name a dictionary `dict`, since that shadows the built-in type in Python. See [Pandas.Series.map](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.map.html) – ALollz Aug 16 '18 at 16:27
  • 1
    As per the `Not a duplicate`, there is functionally no difference. Your column seems to be entirely `NaN`, so it essentially has no information. By default, `.map` returns `NaN` for values whose key is not in the dictionary, so just map and completely overwrite your `Date` column. On the other hand, if you only wanted to replace values in `Date` for keys in the dictionary (say, for instances where `Date` isn't always null), then you can just use `.replace(d)` instead of `.map(d)`. Both are covered in that duplicate. – ALollz Aug 16 '18 at 17:15

5 Answers

34

You can use apply on the Member column (a Series) to solve your problem, where d is your dictionary.

df["Date"] = df["Member"].apply(lambda x: d.get(x))

This takes every value in the Member column and looks it up in your dictionary. If the value is found, the corresponding dictionary value populates the Date column. If the value is not in the dictionary, None is returned.

Also, make sure your dictionary contains valid data types. In your dictionary the keys (abc, def, ghi) should be represented as strings and your dates should be represented as either strings or date objects.
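For reference, here is a minimal runnable sketch of the lookup described above. The DataFrame is rebuilt from the sample data in the question, and the dictionary is assumed to be named d rather than dict; both are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; the dictionary is named d, not dict.
d = {'abc': '1/2/2003', 'def': '1/5/2017', 'ghi': '4/10/2013'}
df = pd.DataFrame({
    'Member': ['xyz', 'uvw', 'abc', 'def', 'ghi'],
    'Group': ['A', 'B', 'A', 'B', 'B'],
    'Date': [np.nan] * 5,
})

# dict.get returns None for a missing key instead of raising KeyError.
found = d.get('abc')
missing = d.get('xyz')

# Look up each Member value; rows without a match end up missing in Date.
df['Date'] = df['Member'].apply(lambda x: d.get(x))
print(df)
```

Rows whose Member is not a key of d (xyz and uvw here) come back as missing values, which `df['Date'].isna()` will report as True.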

vielkind
  • Thanks. Can you give a reference to read up on `get`? Not familiar with it. – Windstorm1981 Aug 16 '18 at 16:56
  • 2
    `get` is a method on the `dict` data structure. It returns the value for the key you pass in; if the key is missing, it returns `None` by default, or the fallback value you pass as the second argument. See the core Python docs. – Jason Strimpel Apr 25 '19 at 08:11
  • 1
    `df["Date"] = df["Member"].apply(lambda x: d.get(x, x))` so if there is no match, you end up with the original `Member` value instead of `None` – grantr Aug 17 '22 at 21:13
8

I would just do a simple map to get the answer.

If we have a dictionary as

d = {'abc':'1/2/2003', 'def':'1/5/2017', 'ghi':'4/10/2013'}

And a dataframe as:

      Member    Group      Date
 0     xyz       A         np.Nan
 1     uvw       B         np.Nan
 2     abc       A         np.Nan
 3     def       B         np.Nan
 4     ghi       B         np.Nan

Then a simple map will solve the problem.

df["Date"] = df["Member"].map(d)

map() looks up each value of df['Member'] in the dictionary d and assigns the matching value to Date. If a value is not a key of the dictionary, that row gets NaN.

We don't need a loop or apply.
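If the Date column already holds values you want to keep (the concern raised in the question), one possible sketch is to map first and then fall back to the existing column with fillna. The pre-filled date in the first row is a made-up value for illustration only.

```python
import numpy as np
import pandas as pd

d = {'abc': '1/2/2003', 'def': '1/5/2017', 'ghi': '4/10/2013'}
df = pd.DataFrame({
    'Member': ['xyz', 'uvw', 'abc', 'def', 'ghi'],
    'Group': ['A', 'B', 'A', 'B', 'B'],
    # Hypothetical: the first row already has a date we want to keep.
    'Date': ['7/4/2001', np.nan, np.nan, np.nan, np.nan],
})

# NaN wherever Member is not a key of d.
mapped = df['Member'].map(d)

# Where the lookup found nothing, fall back to the existing Date value.
df['Date'] = mapped.fillna(df['Date'])
print(df)
```

With this ordering the dictionary value wins whenever both exist. Swapping it to `df['Date'].fillna(mapped)` would instead keep any existing Date and only fill the gaps.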

Joe Ferndz
2

If Member is your index, you can assign a Series built from the dictionary (called d here, to avoid shadowing the built-in dict) to the DataFrame:

df.set_index("Member", inplace=True)
df["Date"] = pd.Series(d)

Pandas will match the index of the Series with the index of the DataFrame.
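A small self-contained sketch of that index alignment, assuming the sample data from the question and a dictionary named d:

```python
import pandas as pd

d = {'abc': '1/2/2003', 'def': '1/5/2017', 'ghi': '4/10/2013'}
df = pd.DataFrame({
    'Member': ['xyz', 'uvw', 'abc', 'def', 'ghi'],
    'Group': ['A', 'B', 'A', 'B', 'B'],
})

# Make Member the index so the labels can line up with the dict keys.
df = df.set_index('Member')

# The Series is indexed by the dict keys; assignment aligns on those labels,
# and index labels with no matching key get NaN.
df['Date'] = pd.Series(d)
print(df)

# Optionally turn Member back into an ordinary column.
flat = df.reset_index()
```

This relies on the Member values being unique enough to serve as an index; with duplicated members, map on the column is probably the simpler route.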

Gregor Sturm
-1
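df['Date'] = df['Date'].astype(object)  # allow string dates in an all-NaN column
for i in df.index:
    member = df.loc[i, 'Member']
    if member in d:
        df.loc[i, 'Date'] = d[member]  # .loc avoids chained assignment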

P.S. It's bad practice to name variables after built-in names (e.g. dict).

bigEvilBanana
-1

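Just create a new DataFrame from the dictionary, then join it to the original:

map_df = pd.DataFrame(list(map_dict.items()), columns=['Member', 'Date']).set_index('Member')
# Drop the empty Date column first so the merge doesn't produce Date_x / Date_y
df = df.drop(columns='Date').merge(map_df, how='left', left_on='Member', right_index=True)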
PMende