I'm trying to do something that should be really simple in pandas, but it seems anything but. I want to add a column to an existing pandas DataFrame whose values are mapped from another (existing) column. Here is a small test case:

import pandas as pd
equiv = {7001:1, 8001:2, 9001:3}
df = pd.DataFrame( {"A": [7001, 8001, 9001]} )
df["B"] = equiv(df["A"])
print(df)

I was hoping the following would result:

      A   B
0  7001   1
1  8001   2
2  9001   3

Instead, I get an error telling me that equiv is not callable. Fair enough, it's a dictionary, but even if I wrap it in a function I still run into errors. So I tried the map method, which seems to work with other operations, but it is also defeated by the dictionary:

df["B"] = df["A"].map(lambda x:equiv[x])

In this case I just get KeyError: 8001. I've read through documentation and previous posts, but have yet to come across anything that suggests how to mix dictionaries with pandas dataframes. Any suggestions would be greatly appreciated.

CT Zhu
Rick Donnelly

1 Answer

The right way to do it is df["B"] = df["A"].map(equiv). Series.map accepts a dict directly and looks up each value of the column as a key.

In [55]:

import pandas as pd
equiv = {7001:1, 8001:2, 9001:3}
df = pd.DataFrame( {"A": [7001, 8001, 9001]} )
df["B"] = df["A"].map(equiv)
print(df)
      A  B
0  7001  1
1  8001  2
2  9001  3

[3 rows x 2 columns]

It also handles missing keys gracefully: a value with no entry in the dict maps to NaN instead of raising an error. Consider the following example:

In [56]:

import pandas as pd
equiv = {7001:1, 8001:2, 9001:3}
df = pd.DataFrame( {"A": [7001, 8001, 9001, 10000]} )
df["B"] = df["A"].map(equiv)
print(df)
       A   B
0   7001   1
1   8001   2
2   9001   3
3  10000 NaN

[4 rows x 2 columns]
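
If NaN is not what you want for unmapped values, here is a minimal sketch of two alternatives. It assumes a placeholder of -1 is acceptable, which is my choice for illustration and not something from the question. Note that NaN is a float, so the mapped column becomes a float dtype whenever a key is missing; filling and casting restores integers. By contrast, a lookup like lambda x: equiv[x] raises KeyError as soon as one value is absent from the dict, which is presumably what happened with the real data in the question.

```python
import pandas as pd

equiv = {7001: 1, 8001: 2, 9001: 3}
df = pd.DataFrame({"A": [7001, 8001, 9001, 10000]})

# Series.map with a dict yields NaN for keys that are absent,
# which forces the result to a float dtype.
mapped = df["A"].map(equiv)

# Option 1: fill the gaps with a placeholder (-1 is an assumed default)
# and cast back to int.
df["B"] = mapped.fillna(-1).astype(int)

# Option 2: look each value up with dict.get, supplying the default directly.
df["C"] = df["A"].map(lambda x: equiv.get(x, -1))

print(df)
```

Option 1 keeps the vectorized dict lookup and is usually the one to reach for first; option 2 is handy when the default has to be computed per value.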
CT Zhu
  • Is there a way to do this if your data is string instead of int? This just gives me NaNs for strings. – griffinc May 11 '17 at 02:34
  • Never mind, see the answers here: http://stackoverflow.com/questions/20250771/remap-values-in-pandas-column-with-a-dict – griffinc May 11 '17 at 03:20
  • And what if your `equiv` dict has lists instead of integers? How can you map only the n-th element of that list? – FaCoffee Feb 12 '19 at 12:48
  • I always get this warning with this method. What is the solution? SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead. See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy – mah65 Feb 14 '20 at 02:52
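
The comments above raise three follow-up cases. The sketch below covers each one with made-up data (the frames, the list-valued dict, and the index n are all hypothetical, not taken from the posts). It assumes the all-NaN result for strings comes from a type mismatch between the column and the dict keys, and that the SettingWithCopyWarning appears because the DataFrame being assigned to was itself produced by slicing another DataFrame, which is the usual cause.

```python
import pandas as pd

equiv = {7001: 1, 8001: 2, 9001: 3}

# Case 1: the column holds strings but the dict keys are ints.
# "7001" != 7001, so every lookup misses and map returns NaN.
# Convert the column first (or build the dict with string keys).
df_str = pd.DataFrame({"A": ["7001", "8001", "9001"]})
df_str["B"] = df_str["A"].astype(int).map(equiv)

# Case 2: the dict values are lists; pick the n-th element of each.
# Hypothetical data. A missing key would raise KeyError here.
equiv_lists = {7001: [1, 10], 8001: [2, 20], 9001: [3, 30]}
n = 1
df_list = pd.DataFrame({"A": [7001, 8001, 9001]})
df_list["B"] = df_list["A"].map(lambda x: equiv_lists[x][n])

# Case 3: assigning a new column to a filtered slice can trigger
# SettingWithCopyWarning. Taking an explicit copy makes the intent clear.
source = pd.DataFrame({"A": [7001, 8001, 9001, 10000]})
subset = source[source["A"] < 10000].copy()
subset["B"] = subset["A"].map(equiv)

print(df_str, df_list, subset, sep="\n\n")
```

In case 3 the new column lands only on subset; source is left unchanged, which is usually what was intended when the warning shows up.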