0

I have a dataset with a lot of incorrect duplicates on a certain field, in my reproducible example there are duplicates in serial with different color and shape. I have the actual dataframe with the correct color and shape to serial mapped, and need to select the correct rows with that.

Example:

import pandas as pd

items = pd.DataFrame({
    'serial': ['0001', '0001', '0001', '0002', '0002', '0002'],
    'color': ['Blue', 'Red', 'Green', 'Blue', 'Red', 'Green'],
    'shape': ['Square', 'Circle', 'Star', 'Square', 'Circle', 'Star'],
    'more_data': ['G', 'H', 'I', 'J', 'K', 'L'],
    'even_more_data': ['A', 'B', 'C', 'D', 'E', 'F']
})

real = pd.DataFrame({
    'serial': ['0001', '0002'],
    'color': ['Blue', 'Red'],
    'shape': ['Square', 'Circle']
})

Then,

Out[1]: items
    serial  color   shape   more_data   even_more_data
0   0001    Blue    Square  G           A
1   0001    Red     Circle  H           B
2   0001    Green   Star    I           C
3   0002    Blue    Square  J           D
4   0002    Red     Circle  K           E
5   0002    Green   Star    L           F

Out[2]: real
    serial  color   shape
0   0001    Blue    Square
1   0002    Red     Circle

I need to use 'real' to select the correct rows in 'items' so the expected result is:

Out[3]: 
    serial  color   shape   more_data   even_more_data
0   0001    Blue    Square  G           A
4   0002    Red     Circle  K           E
Nick ODell
  • 15,465
  • 3
  • 32
  • 66
geistmate
  • 525
  • 2
  • 5
  • 14

1 Answers1

1

You can use merge:

real.merge(items)                                                                                                                                                                    

output

Out[305]: 
  serial color   shape more_data even_more_data
0   0001  Blue  Square         G              A
1   0002   Red  Circle         K              E
oppressionslayer
  • 6,942
  • 2
  • 7
  • 24
  • That was more simple than I thought. I guess I was searching up the wrong keywords to find the right answer. Thank you. – geistmate Dec 11 '19 at 17:35
  • @geistmate glad i could help! if you like the answer check out this: https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235 – oppressionslayer Dec 11 '19 at 17:57