0

I have an excel file that I want to use to update another larger excel file. Both spreadsheets have a serial number column and I want to use the values in the serial number col from spreadsheet A to search for the serial number in spreadsheet B and then update some other values on that row whenever there is a match.

I've put both the spreadsheets into separate dataframes (dfA and dfB) and I was trying to use dfA to search and update dfB with a merge but can't quite figure it out. I'm wondering if a merge is the best way to do this? And if it is how can I do it with a merge? or what is the best way to do it?

Here's a sample of the data:

dfA: name serialNum status
     test1   1234   deployed
     test2   4321   retired
     test3   1111   inventory

dfB: name serialNum status
     test1   1234   NA
     test2   4321   NA
     test3   1111   NA

Basically I have a very large spreadsheet and I need to update the status with another spreadsheet

ssumme
  • 45
  • 5
  • 2
    Update your post with a sample of `dfA` and `dfB`. Do you read [**Pandas Merging 101**](https://stackoverflow.com/q/53645882/15239951)? – Corralien Sep 17 '21 at 21:08
  • I've added a sample of my data and I have not read that but will, thank you! – ssumme Sep 17 '21 at 21:28

1 Answers1

2

You can use merge to update the status from dfA to dfB.

Input data:

>>> dfA
    name  serialNum     status
0  test1       1234   deployed
1  test2       4321    retired
2  test3       1111  inventory

>>> dfB
    name  serialNum  status
0  test1       1234     NaN
1  test2       4321     NaN
2  test3       1111     NaN
dfB['status'] = dfA.merge(dfB[['name', 'serialNum']],
                          on=['name', 'serialNum'],
                          how='right')['status']

The really important parameter here is how the merge is performed.

Output

>>> dfB
    name  serialNum     status
0  test1       1234   deployed
1  test2       4321    retired
2  test3       1111  inventory
Corralien
  • 109,409
  • 8
  • 28
  • 52