I have two DataFrames, df_a and df_b:

# df_a
number    cur    code
1000      USD    700
2000      USD    800
3000      USD    900

# df_b
number    amount    deletion    code
1000      0.0       L           700
1000      10.0      X           700
1000      10.0      X           700
2000      20.0      X           800
2000      20.0      X           800
3000      0.0       L           900
3000      0.0       L           900
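
For anyone reproducing this, the two frames above can be built as in the sketch below. The dtypes (integers for number and code, floats for amount) are an assumption read off the printed values.

```python
import pandas as pd

# df_a: one row per number (dtypes assumed from the printed table)
df_a = pd.DataFrame({
    'number': [1000, 2000, 3000],
    'cur': ['USD', 'USD', 'USD'],
    'code': [700, 800, 900],
})

# df_b: several rows per number; deletion == 'L' marks a deleted row
df_b = pd.DataFrame({
    'number': [1000, 1000, 1000, 2000, 2000, 3000, 3000],
    'amount': [0.0, 10.0, 10.0, 20.0, 20.0, 0.0, 0.0],
    'deletion': ['L', 'X', 'X', 'X', 'X', 'L', 'L'],
    'code': [700, 700, 700, 800, 800, 900, 900],
})
```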

I want to left-merge df_a with df_b:

df_a = df_a.merge(df_b.loc[df_b.deletion != 'L'], how='left', on=['number', 'code'])

I also want to create a flag column called deleted in the merged df_a, with three possible values, full, partial and none:

full - all rows associated with a particular number value have deletion = L;

partial - some, but not all, rows associated with a particular number value have deletion = L;

none - no rows associated with a particular number value have deletion = L.

Also, rows from df_b with deletion = L should be excluded from the merge, so the result looks like:

 number    amount    deletion    deleted    cur    code
 1000      10.0      X           partial    USD    700
 1000      10.0      X           partial    USD    700
 2000      20.0      X           none       USD    800
 2000      20.0      X           none       USD    800
 3000      0.0       NaN         full       USD    900

I tried:

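# True where a row is NOT marked as deleted
g = df_b['deletion'].ne('L').groupby([df_b['number'], df_b['code']])
m1 = g.any()  # at least one row in the group is not deleted
m2 = g.all()  # no row in the group is deleted

d1 = dict.fromkeys(m1.index[m1 & ~m2], 'partial')
d2 = dict.fromkeys(m1.index[~m1], 'full')  # every row in the group is deleted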

d = {**d1, **d2}
df_a = df_a.merge(df_b.loc[df_b.deletion != 'L'], how='left', on=['code', 'number'])

df_a['deleted'] = df_a[['number', 'code']].map(d).fillna('none')

but I got an error:

AttributeError: 'DataFrame' object has no attribute 'map'

It seems a DataFrame does not have a map method, so I am wondering whether there is an alternative way to achieve this.

daiyue
  • @jpp Sorry, updated again. I was trying `df_a['deleted'] = df_a[['number', 'code']].map(d).fillna('none')`, which caused the error, so I am wondering if there is another way to do the same thing. – daiyue Aug 08 '18 at 10:56
  • Does this answer your question? [AttributeError: 'DataFrame' object has no attribute 'map'](https://stackoverflow.com/questions/39535447/attributeerror-dataframe-object-has-no-attribute-map) – AMC Feb 08 '20 at 01:06

1 Answer

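pd.DataFrame objects don't have a map method in the pandas versions current at the time of writing. (pandas 2.1 later added DataFrame.map as the elementwise successor to applymap, but it applies a function to each cell, so it still would not look up (number, code) pairs in a dictionary.) You can instead construct an index from the two columns and use pd.Index.map with a function: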

df_a['deleted'] = df_a.set_index(['number', 'code']).index.map(d.get)
df_a['deleted'] = df_a['deleted'].fillna('none')
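
Putting this together with the sample data from the question gives roughly the following. This is a sketch rather than a definitive implementation: the names not_deleted, any_kept, all_kept and merged are introduced here for illustration, and the dictionary is built so that groups where every row has deletion = L map to 'full', while groups with no L rows fall through to 'none'.

```python
import pandas as pd

# Sample data copied from the question.
df_a = pd.DataFrame({
    'number': [1000, 2000, 3000],
    'cur': ['USD', 'USD', 'USD'],
    'code': [700, 800, 900],
})
df_b = pd.DataFrame({
    'number': [1000, 1000, 1000, 2000, 2000, 3000, 3000],
    'amount': [0.0, 10.0, 10.0, 20.0, 20.0, 0.0, 0.0],
    'deletion': ['L', 'X', 'X', 'X', 'X', 'L', 'L'],
    'code': [700, 700, 700, 800, 800, 900, 900],
})

# True where a row is NOT marked as deleted, grouped by (number, code).
not_deleted = df_b['deletion'].ne('L').groupby([df_b['number'], df_b['code']])
any_kept = not_deleted.any()  # at least one row survives
all_kept = not_deleted.all()  # every row survives

# Keys are (number, code) tuples; groups with no 'L' rows are left out
# on purpose and become 'none' via fillna below.
d = {
    **dict.fromkeys(any_kept[any_kept & ~all_kept].index, 'partial'),
    **dict.fromkeys(any_kept[~any_kept].index, 'full'),
}

# Left merge, ignoring the deleted rows of df_b.
merged = df_a.merge(df_b.loc[df_b['deletion'] != 'L'],
                    how='left', on=['number', 'code'])

# Look up each (number, code) pair through a MultiIndex.
merged['deleted'] = merged.set_index(['number', 'code']).index.map(d.get)
merged['deleted'] = merged['deleted'].fillna('none')
```

For the sample data, the deleted column comes out as partial, partial, none, none, full. The unmatched row for number 3000 has NaN in both amount and deletion, because the left merge found no surviving df_b rows for it.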

Compatibility note

For Pandas versions >0.25, you can use pd.Index.map directly with a dictionary, i.e. use d instead of d.get.

For prior versions, we use d.get instead of d because, unlike pd.Series.map, pd.Index.map does not accept a dictionary directly, but it does accept a callable such as d.get. Note also that the fillna step is split out because pd.Index.map does not return a Series.
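
If the version differences in pd.Index.map are a concern, one alternative that avoids it entirely is a plain dictionary lookup over zipped columns. The frame df and the dictionary d below are hypothetical stand-ins for the merged frame and the lookup built in the question; this is a sketch of the idea, not the only way to do it.

```python
import pandas as pd

# Hypothetical stand-ins for the merged frame and the (number, code) lookup.
df = pd.DataFrame({'number': [1000, 2000, 3000], 'code': [700, 800, 900]})
d = {(1000, 700): 'partial', (3000, 900): 'full'}

# dict.get supplies the 'none' default directly, so no fillna step is needed.
df['deleted'] = [d.get(key, 'none') for key in zip(df['number'], df['code'])]
```

This trades vectorised indexing for a Python-level loop, which is usually fine for frames of modest size.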

jpp