I am trying to create a new column in a dataframe by building a dictionary based on an existing column and calling the 'map' function on that column. This worked for quite some time. However, the notebook has started throwing:

AttributeError: 'DataFrame' object has no attribute 'map'

I haven't changed the kernel or the Python version. Here's the code I am using.

dict = {1: 'A',
        2: 'B',
        3: 'C',
        4: 'D',
        5: 'E'}

# Creating an interval-type 
data['new'] = data['old'].map(dict)

How can I fix this?

James Z
redwolf_cr7
  • Does this answer your question? [AttributeError: 'DataFrame' object has no attribute 'map'](https://stackoverflow.com/questions/39535447/attributeerror-dataframe-object-has-no-attribute-map) – AMC Feb 08 '20 at 01:06

3 Answers

map is a method that you can call on a pandas.Series object. This method doesn't exist on pandas.DataFrame objects.

df['new'] = df['old'].map(d)

In your code (the equivalent of the line above), df['old'] is returning a pandas.DataFrame object for some reason.

  • As @jezrael points out, this could be due to having more than one column named old in the dataframe.
  • Or perhaps your code isn't quite the same as the example you have given.

  • Either way, the error occurs because you are calling map() on a pandas.DataFrame object.
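A quick way to check which of these cases applies is to inspect the type of the selection and look for repeated column names. The following is only a minimal sketch: the frame is a made-up example with a duplicated old column, not the asker's data.

```python
import pandas as pd

# Hypothetical frame with a duplicated "old" column, for illustration only.
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=["old", "old", "col"])

selected = df["old"]

# A Series is expected here; a DataFrame means the label matched several columns.
is_series = isinstance(selected, pd.Series)

# Column names that occur more than once.
duplicated_names = df.columns[df.columns.duplicated()].tolist()

print(type(selected).__name__)
print(duplicated_names)
```

If is_series is False and duplicated_names is not empty, duplicate column names are the likely cause.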

Arran Duff

The main problem is that selecting the old column returns a DataFrame instead of a Series, so map, which is implemented for Series, fails.

The likely cause is a duplicated column name old. Selecting that name then returns all columns named old as a DataFrame:

import pandas as pd

df = pd.DataFrame([[1,3,8],[4,5,3]], columns=['old','old','col'])
print (df)
   old  old  col
0    1    3    8
1    4    5    3

print(df['old'])
   old  old
0    1    3
1    4    5

# don't use dict as a variable name, because it shadows the Python built-in
d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
df['new'] = df['old'].map(d)
print (df)

AttributeError: 'DataFrame' object has no attribute 'map'

A possible solution is to deduplicate these column names:

s = df.columns.to_series()
new = s.groupby(s).cumcount().astype(str).radd('_').replace('_0','')
df.columns += new
print (df)
   old  old_1  col
0    1      3    8
1    4      5    3
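If the duplicated columns are not all needed, renaming is not the only option. As a hedged alternative sketch (not part of the original answer), you can either drop the repeated names and keep the first occurrence, or pick one of the duplicates by position. The mapping d and the frame below are illustrative assumptions.

```python
import pandas as pd

# Illustrative mapping and frame; both are assumptions for this sketch.
d = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"}
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=["old", "old", "col"])

# Option 1: keep only the first occurrence of each column name.
deduped = df.loc[:, ~df.columns.duplicated()].copy()
deduped["new"] = deduped["old"].map(d)

# Option 2: select one of the duplicates by position, which always gives a Series.
first_old = df.iloc[:, 0]
mapped = first_old.map(d)

print(deduped)
print(mapped)
```

Option 1 silently discards the later duplicates, so it is only appropriate when they carry no extra information.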

Another possible cause is a MultiIndex in the columns. You can test for it like this:

mux = pd.MultiIndex.from_arrays([['old','old','col'],['a','b','c']])
df = pd.DataFrame([[1,3,8],[4,5,3]], columns=mux)
print (df)
  old    col
    a  b   c
0   1  3   8
1   4  5   3

print (df.columns)
MultiIndex(levels=[['col', 'old'], ['a', 'b', 'c']],
           codes=[[1, 1, 0], [0, 1, 2]])
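Instead of reading the printed columns, the same check can be done programmatically. This is a small sketch under the assumption that the frame looks like the example above; the variable names are illustrative.

```python
import pandas as pd

# Same illustrative frame as above, rebuilt so the snippet is self-contained.
mux = pd.MultiIndex.from_arrays([["old", "old", "col"], ["a", "b", "c"]])
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=mux)

# True when the columns have more than one level.
has_multiindex = isinstance(df.columns, pd.MultiIndex)
n_levels = df.columns.nlevels

# Selecting only the first level returns a sub-frame, not a Series.
selected = df["old"]

print(has_multiindex, n_levels)
print(type(selected).__name__, list(selected.columns))
```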

One solution is to flatten the MultiIndex:

# Python 3.6+
df.columns = [f'{a}_{b}' for a, b in df.columns]
# older Python versions
#df.columns = ['{}_{}'.format(a,b) for a, b in df.columns]
print (df)
   old_a  old_b  col_c
0      1      3      8
1      4      5      3

Another solution is to keep the MultiIndex, select the column with a tuple, map it, and assign the result to a new tuple key:

df[('new', 'd')] = df[('old', 'a')].map(d)
print (df)
  old    col new
    a  b   c   d
0   1  3   8   A
1   4  5   3   D

print (df.columns)
MultiIndex(levels=[['col', 'old', 'new'], ['a', 'b', 'c', 'd']],
           codes=[[1, 1, 0, 2], [0, 1, 2, 3]])
jezrael
  • Thank you for the detailed answer! The problem was indeed caused by the duplicate columns. I would never have suspected that. I was using pd.concat together with get_dummies on 4 different columns, in 4 separate pd.concat calls. I then changed it to a single pd.concat call and passed the dataframe and the columns argument to get_dummies. For some reason, this change started creating duplicate columns, and I had to revert it. – redwolf_cr7 Feb 10 '19 at 18:22
0
import pandas as pd
f_dict = {1:0,2:1,3:2}
m = pd.Series([1,2,3])
res = m.map(f_dict)
print(res)

This works because m is a pd.Series object. The following usage is wrong because m is a pd.DataFrame object:

import pandas as pd
f_dict = {1:0,2:1,3:2}
m = pd.DataFrame([1,2,3])
res = m.map(f_dict)
print(res)
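If m has to stay a DataFrame, one way to make the lookup work is to pull out its single column as a Series first. The sketch below assumes the frame has exactly one column with the default label 0, as in the example above. As far as I know, pandas 2.1 and later do define DataFrame.map, but as an elementwise function application rather than a dict lookup, so the exact error for the failing call may differ between versions.

```python
import pandas as pd

f_dict = {1: 0, 2: 1, 3: 2}
m = pd.DataFrame([1, 2, 3])

# Select the only column by its default label 0; this returns a Series.
res = m[0].map(f_dict)

# Alternative: collapse a one-column frame into a Series.
res_squeezed = m.squeeze("columns").map(f_dict)

print(res)
print(res_squeezed)
```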
chutian