
I am trying to map a dictionary to a dataframe. I dug through some code and pieced together what I thought would work, but the code isn't running. Can anyone help with getting this to map?

Top15 is an existing dataframe with Country as the Index.

ContinentDict  = {'China':'Asia', 
              'United States':'North America', 
              'Japan':'Asia', 
              'United Kingdom':'Europe', 
              'Russian Federation':'Europe', 
              'Canada':'North America', 
              'Germany':'Europe', 
              'India':'Asia',
              'France':'Europe', 
              'South Korea':'Asia', 
              'Italy':'Europe', 
              'Spain':'Europe', 
              'Iran':'Asia',
              'Australia':'Australia', 
              'Brazil':'South America'}




Top15['Continent'] = Top15['Country'].map(ContinentDict)
deuwde
  • Without raw data and the code for your df, this is just speculative; you need to post data and code so that others can reproduce this – EdChum Nov 28 '16 at 16:42
  • What's not working, can you show your inputs and expected output, e.g. what is `Top15` – AChampion Nov 28 '16 at 16:43
  • Please read [this](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – Eli Sadoff Nov 28 '16 at 16:43

1 Answer


This ought to work. Here's an example:

In [1]: import pandas as pd
        Top15 = pd.DataFrame({'Country':['France','Brazil', 'Canada', 'Japan']})
        Top15
Out[1]:
  Country
0  France
1  Brazil
2  Canada
3   Japan

Now you can indeed use pd.Series.map using a dict as argument:

In [2]: Top15['Continent'] = Top15['Country'].map(ContinentDict)
        Top15
Out[2]:
  Country      Continent
0  France         Europe
1  Brazil  South America
2  Canada  North America
3   Japan           Asia
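
One behaviour worth checking when mapping with a dict: values that have no matching key come back as NaN rather than raising an error. The sketch below uses a cut-down dict and a made-up country name ('Atlantis') purely for illustration; neither is from the original data.

```python
import pandas as pd

# Cut-down version of ContinentDict, for illustration only.
ContinentDict = {'France': 'Europe', 'Japan': 'Asia'}

# 'Atlantis' is a made-up value with no entry in the dict.
df = pd.DataFrame({'Country': ['France', 'Japan', 'Atlantis']})

# Series.map with a dict leaves unmatched values as NaN.
df['Continent'] = df['Country'].map(ContinentDict)

# List the countries that did not get a continent.
unmatched = df.loc[df['Continent'].isna(), 'Country'].tolist()
print(unmatched)
```

Checking for NaN after the map is a quick way to spot spelling mismatches between the dataframe and the dict.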

Update: now that we know Top15 is indexed by country

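The problem is that Country is the index rather than a column, so Top15['Country'] raises a KeyError. In the pandas version current when this was written, index.map also didn't allow a dict as an argument (it has accepted one since pandas 0.23). You can do either of these: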

# 1000 loops, best of 3: 696 µs per loop
Top15['Continent'] = Top15.index.to_series().map(ContinentDict)

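# 1000 loops, best of 3: 722 µs per loop
# .values is needed here: pd.Series(Top15.index) gets a fresh 0..n-1 index,
# so assigning it directly would align on labels and fill the column with NaN
Top15['Continent'] = pd.Series(Top15.index).map(ContinentDict).values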

Or much faster:

# 10000 loops, best of 3: 156 µs per loop
Top15['Continent'] = Top15.index.map(lambda x: ContinentDict[x])
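
To make the indexed case reproducible, here is a minimal sketch with a stand-in Top15. The 'Rank' column and the four countries are assumptions for illustration, not the asker's real data. It shows why Top15['Country'] fails, the to_series() route, and a reset_index() alternative that turns Country into a column first.

```python
import pandas as pd

ContinentDict = {'France': 'Europe', 'Brazil': 'South America',
                 'Canada': 'North America', 'Japan': 'Asia'}

# Hypothetical stand-in for Top15: Country is the index, not a column.
countries = pd.Index(['France', 'Brazil', 'Canada', 'Japan'], name='Country')
Top15 = pd.DataFrame({'Rank': [1, 2, 3, 4]}, index=countries)

# Column lookup only searches the columns, so this raises KeyError.
try:
    Top15['Country']
    country_is_column = True
except KeyError:
    country_is_column = False

# to_series() keeps the country labels as the index, so assignment aligns.
Top15['Continent'] = Top15.index.to_series().map(ContinentDict)

# Alternative: move Country into a column, map, then restore the index.
alt = Top15.reset_index()
alt['Continent2'] = alt['Country'].map(ContinentDict)
alt = alt.set_index('Country')

print(Top15)
```

Note that the lambda version looks keys up with ContinentDict[x], so it raises a KeyError for a country missing from the dict, whereas the map-with-dict versions leave NaN instead.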
Julien Marrec