I have two dataframes, DfMaster and DfError.

DfMaster looks like:

     Id           Name Building
0  4653     Jane Smith        A
1  3467    Steve Jones        B
2    34        Kim Lee        F
3  4567     John Evans        A 
4  3643   Kevin Franks        S
5   244  Stella Howard        D

and DfError looks like:

     Id           Name Building
0  4567     John Evans        A 
1   244  Stella Howard        D

In DfMaster, I would like to change the Building value of a record to DD if that record appears in the DfError dataframe. So my desired output would be:

     Id           Name Building
0  4653     Jane Smith        A
1  3467    Steve Jones        B
2    34        Kim Lee        F
3  4567     John Evans        DD 
4  3643   Kevin Franks        S
5   244  Stella Howard        DD

I am trying to use the following:

DfMaster.loc[DfError['Id'], 'Building'] = 'DD'

However, I get an error:

KeyError: "None of [Int64Index([4567,244], dtype='int64')] are in the [index]"

What have I done wrong?

halfer
Stacey
  • Can you please change your question to be a [Good Reproducible Pandas Example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples)? I.e. with a machine-readable DataFrame definition. – Nils Werner Jul 26 '19 at 11:00

2 Answers


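Try this using np.where:

import numpy as np
# Ids that appear in the error dataframe
errors = DfError['Id'].unique()
# Set Building to 'DD' where the Id is in errors, otherwise keep the existing value
DfMaster['Building'] = np.where(DfMaster['Id'].isin(errors), 'DD', DfMaster['Building'])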
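
For anyone who wants to run this end to end, here is a minimal sketch of the np.where approach. The two dataframes are rebuilt by hand from the tables in the question (the question did not supply machine-readable definitions), and the variable name in_error is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the tables in the question.
DfMaster = pd.DataFrame({
    "Id": [4653, 3467, 34, 4567, 3643, 244],
    "Name": ["Jane Smith", "Steve Jones", "Kim Lee",
             "John Evans", "Kevin Franks", "Stella Howard"],
    "Building": ["A", "B", "F", "A", "S", "D"],
})
DfError = pd.DataFrame({
    "Id": [4567, 244],
    "Name": ["John Evans", "Stella Howard"],
    "Building": ["A", "D"],
})

# Boolean Series: True where the master Id also appears in DfError.
in_error = DfMaster["Id"].isin(DfError["Id"])

# np.where picks 'DD' where the condition holds, else the existing value.
DfMaster["Building"] = np.where(in_error, "DD", DfMaster["Building"])

print(DfMaster)
```

Note that np.where builds a whole new column and assigns it back, so it touches every row; for a simple overwrite like this the result should be the same as a masked assignment.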
tawab_shakeel

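DataFrame.loc expects index labels or a Boolean Series as its row selector, not values from a column. DfError['Id'] contains the values 4567 and 244, but the index of DfMaster only has the labels 0 to 5, hence the KeyError.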

I believe this should do the trick:

DfMaster.loc[DfMaster['Id'].isin(DfError['Id']), 'Building'] = 'DD'

Basically, it says: for all rows of DfMaster whose Id value is present in DfError['Id'], set the value of 'Building' to 'DD'.
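
As a rough illustration of the difference between labels and a Boolean mask, the sketch below rebuilds the question's data by hand (only the Id and Building columns, for brevity) and shows both the masked assignment and an alternative that makes Id the index so the original label-based call works. The name indexed is hypothetical, and the second variant assumes every Id in DfError also exists in DfMaster.

```python
import pandas as pd

# Sample data reconstructed from the question (Name column omitted for brevity).
DfMaster = pd.DataFrame({
    "Id": [4653, 3467, 34, 4567, 3643, 244],
    "Building": ["A", "B", "F", "A", "S", "D"],
})
DfError = pd.DataFrame({"Id": [4567, 244], "Building": ["A", "D"]})

# The default index holds the labels 0..5, so 4567 is not a valid row label.
print(4567 in DfMaster.index)

# Variant 1: Boolean mask, one True/False per row of DfMaster.
mask = DfMaster["Id"].isin(DfError["Id"])
DfMaster.loc[mask, "Building"] = "DD"

# Variant 2: make Id the index, after which the Ids are row labels.
# Assumes all error Ids are present; a missing Id would raise a KeyError.
indexed = DfMaster.set_index("Id")
indexed.loc[DfError["Id"], "Building"] = "DD"

print(DfMaster)
```

The mask version is usually the safer choice here because it silently skips Ids that are missing from DfMaster, whereas label-based lookup fails on them.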

NOhs
Jano