I have two CSV files. Call the first one 'test1.csv':
+---------+---+---+
| user_id | A | B |
+---------+---+---+
| 1       | a | f |
| 2       | b | g |
| 3       | c | h |
| 4       | d | i |
| 5       | e | j |
+---------+---+---+
The second one is 'test2.csv':
+---------+---+---+
| user_id | C | D |
+---------+---+---+
| 1       | k | r |
| 2       | l | s |
| 4       | m | t |
| 5       | n | u |
| 6       | o | v |
| 7       | p | w |
| 8       | q | x |
+---------+---+---+
Note: not every user_id in test1.csv appears in test2.csv (and vice versa), and the rows are not necessarily in order.
Desired output:
+---------+---+---+---+---+
| user_id | A | B | C | D |
+---------+---+---+---+---+
| 1       | a | f | k | r |
| 2       | b | g | l | s |
| 4       | d | i | m | t |
| 5       | e | j | n | u |
+---------+---+---+---+---+
In essence, I want to merge the two pandas DataFrames on user_id and keep only the ids that appear in both files.
Any help is appreciated :)
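
A minimal sketch of the operation described above, assuming both files are plain comma-separated CSVs with a user_id column. This is one possible approach, not necessarily the only one: an inner join via `DataFrame.merge`, which keeps only the keys present in both frames. The `io.StringIO` objects are stand-ins for the real files so the snippet runs on its own; with the actual files you would call `pd.read_csv('test1.csv')` and `pd.read_csv('test2.csv')` instead.

```python
import io

import pandas as pd

# Stand-ins for test1.csv and test2.csv (assumed comma-separated with a header row).
test1_csv = io.StringIO("user_id,A,B\n1,a,f\n2,b,g\n3,c,h\n4,d,i\n5,e,j\n")
test2_csv = io.StringIO("user_id,C,D\n1,k,r\n2,l,s\n4,m,t\n5,n,u\n6,o,v\n7,p,w\n8,q,x\n")

df1 = pd.read_csv(test1_csv)
df2 = pd.read_csv(test2_csv)

# Inner join on user_id: rows whose id is missing from either frame are dropped.
merged = df1.merge(df2, on="user_id", how="inner")

print(merged)
```

`how="inner"` is already the default for `merge`, so it is spelled out here only to make the intent explicit. The ids 3, 6, 7 and 8 are dropped because each appears in only one of the two files.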