I have a single dataframe containing an ID column id
, and I know that the ID will exist either exactly in one row ('mismatched') or two rows ('matched') in the dataframe.
- In order to select the mismatched rows and the pairs of matched rows I can use a
groupby
on the ID column. - Now for each group, I want to take some columns from the second (pair) row, rename them, and copy them to the first row. I can then discard all the second rows and return a single dataframe containing all the modified first rows (for each and every group).
- Where there is no second row (mismatched) - it's fine to put NaN in its place.
To illustrate this see table below id=1
and 3
are a matched pair, but id=2
is mismatched:
entity id partner value
A 1 B 200
B 1 A 300
A 2 B 600
B 3 C 350
C 3 B 200
The resulting transformation should leave me with the following:
entity id partner entity_value partner_value
A 1 B 200 300
A 2 B 600 NaN
B 3 C 350 200
What's baffling me is how to come up with a generic way of getting the matching partner_value
from row 2, copied into row 1 after the groupby, in a way that also works when there is no matching id.