I would like to join two files by key but I would like some columns to be joined together for example:
File1:
df1 = pd.DataFrame({'List' : ['P111', 'P999', 'P111;P999;P777', 'P555', 'P666;P111;P333'],
'Color' : ['red', 'red', 'blue','yellow', 'red']})
List Color
0 P111 red
1 P999 red
2 P111;P999;P777 blue
3 P555 yellow
4 P666;P111;P333 red
File2:
df2 = pd.DataFrame({'Cod' : ['P111', 'P222', 'P333', 'P444', 'P555', 'P666', 'P777'],
'Animal' : ['DOG', 'CAT', 'BUG','SNAKE,DOG', 'CAT,BUG', 'DOG', 'SNAKE'],
'Letter' : ['A,F', 'C', 'S,M', 'F,L', 'C,A','M,C', 'Z,L']})
Cod Animal Letter
0 P111 DOG A,F
1 P222 CAT C
2 P333 BUG S,M
3 P444 SNAKE,DOG F,L
4 P555 CAT,BUG C,A
5 P666 DOG M,C
6 P777 SNAKE Z,L
I would like to merge with primary key List (from file1) and Cod (from file2), to get:
List Color Animal Letter
0 P111 red DOG,FROG A,F
1 P999 red - -
2 P111;P999;P777 blue DOG,FROG|-|SNAKE A,F|-|Z,L
3 P555 yellow CAT,BUG C,A
4 P666;P111;P333 red DOG|DOG,FROG|BUG M,C|A,F|S,M
I think we need something like a left join and an agragation but I don't know how. In the final table I would like that in the values not found there was a - . While I would like a | to separate values in "aggregated" columns
Which is the best way?