I would like to delete the duplicate match in each row. Column `X1` is the reference column that we match against; we never delete values from `X1`.
Example (starting point) DataFrame df_client:
| Index |Name |email |city |X1 |X2 |X3 |X4 |X5 |
|--- |--- |--- |--- |---|---|---|---|---|
| 0 |Mary |Mary@hotmail.com |London |AB1|KD2|AB1| |CM2|
| 1 |john |john@hotmail.com |Tokyo |LK2|LK2| |IG5| |
| 2 |karl |karl@hotmail.com |London |MK6| |MK6| | |
| 3 |jasmin |jasmin@hotmail.com|Toronto|UH5|FG6|UH5| | |
| 4 |Frank |Frank@hotmail.com |Paris |PO4| | |PO4| |
| 5 |lee |lee@hotmail.com |Madrid |RT3|RT3|WS1| | |
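For reproducibility, here is a minimal sketch that builds the starting DataFrame. It assumes the empty cells are empty strings and that `Index` is just the default integer index; in my real data they could also be `NaN`.

```python
import pandas as pd

# Starting point; empty cells are assumed to be empty strings ("")
df_client = pd.DataFrame({
    "Name": ["Mary", "john", "karl", "jasmin", "Frank", "lee"],
    "email": ["Mary@hotmail.com", "john@hotmail.com", "karl@hotmail.com",
              "jasmin@hotmail.com", "Frank@hotmail.com", "lee@hotmail.com"],
    "city": ["London", "Tokyo", "London", "Toronto", "Paris", "Madrid"],
    "X1": ["AB1", "LK2", "MK6", "UH5", "PO4", "RT3"],
    "X2": ["KD2", "LK2", "", "FG6", "", "RT3"],
    "X3": ["AB1", "", "MK6", "UH5", "", "WS1"],
    "X4": ["", "IG5", "", "", "PO4", ""],
    "X5": ["CM2", "", "", "", "", ""],
})
```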
I would like to compare the values in `X2`, `X3`, `X4` and `X5` to `X1`, row by row. When we find a matching value (e.g. in row 0, `AB1` appears in both `X1` and `X3`), I would like to delete it from the other column (here, `AB1` from `X3`). In other words, we always keep the value in `X1` and delete the matching value from `X2`, `X3`, `X4` or `X5`.
I would like to add that each row is guaranteed to have a value in `X2`, `X3`, `X4` or `X5` that matches the value in `X1`.
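To make the rule concrete, here is a rough sketch of what I have in mind, shown on the `X` columns only. It assumes empty cells are empty strings, and it blanks every cell in `X2` to `X5` that equals `X1` in the same row (not just the first match). I am not sure this is the idiomatic way to do it.

```python
import pandas as pd

# Only the X columns of the example; empty cells assumed to be ""
df = pd.DataFrame({
    "X1": ["AB1", "LK2", "MK6", "UH5", "PO4", "RT3"],
    "X2": ["KD2", "LK2", "", "FG6", "", "RT3"],
    "X3": ["AB1", "", "MK6", "UH5", "", "WS1"],
    "X4": ["", "IG5", "", "", "PO4", ""],
    "X5": ["CM2", "", "", "", "", ""],
})

other_cols = ["X2", "X3", "X4", "X5"]

# True wherever a cell in X2..X5 equals the X1 value of the same row
matches = df[other_cols].eq(df["X1"], axis=0)

# Blank out the matching cells; X1 itself is never touched
df[other_cols] = df[other_cols].mask(matches, "")
```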
The desired result will look like this:
|Index|Name |email |city |X1 |X2 |X3 |X4 |X5 |
|--- |--- |--- |--- |---|---|---|---|---|
| 0 |Mary |Mary@hotmail.com |London |AB1|KD2| | |CM2|
| 1 |john |john@hotmail.com |Tokyo |LK2| | |IG5| |
| 2 |karl |karl@hotmail.com |London |MK6| | | | |
| 3 |jasmin|jasmin@hotmail.com|Toronto|UH5|FG6| | | |
| 4 |Frank |Frank@hotmail.com |Paris |PO4| | | | |
| 5 |lee |lee@hotmail.com |Madrid |RT3| |WS1| | |
It's not important, but ideally I would also like to shift the remaining values to the left to fill any empty cells, something like this:
|Index|Name |email |city |X1 |X2 |X3 |X4 |X5 |
|--- |--- |--- |--- |---|---|---|---|---|
| 0 |Mary |Mary@hotmail.com |London |AB1|KD2|CM2| | |
| 1 |john |john@hotmail.com |Tokyo |LK2|IG5| | | |
| 2 |karl |karl@hotmail.com |London |MK6| | | | |
| 3 |jasmin|jasmin@hotmail.com|Toronto|UH5|FG6| | | |
| 4 |Frank |Frank@hotmail.com |Paris |PO4| | | | |
| 5 |lee |lee@hotmail.com |Madrid |RT3|WS1| | | |
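For the optional left shift, this is roughly the behaviour I mean, sketched on three rows that already had their matches removed. Again it assumes empty cells are empty strings; `shift_left` is just a helper name I made up for illustration.

```python
import pandas as pd

# Rows 0, 1 and 5 after the matching values were deleted ("" = empty cell)
df = pd.DataFrame({
    "X1": ["AB1", "LK2", "RT3"],
    "X2": ["KD2", "", ""],
    "X3": ["", "", "WS1"],
    "X4": ["", "IG5", ""],
    "X5": ["CM2", "", ""],
})

other_cols = ["X2", "X3", "X4", "X5"]

def shift_left(row):
    # Keep the non-empty values in order, then pad with "" on the right
    values = [v for v in row if v != ""]
    return values + [""] * (len(row) - len(values))

shifted = df[other_cols].apply(shift_left, axis=1, result_type="expand")

# Assign by position so the integer column labels of `shifted` do not matter
df[other_cols] = shifted.to_numpy()
```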
Moving the values to the left is really not important; if you can help me with just deleting the matching values, that will be more than enough.
Thank you