I am working with 2 pandas dataframes. They are linked by the MessageID
. I would normally use a simple map, but if there are multiple hashes for the MessageID
I would like to create a new row for each additional one. If there are multiple senders or recipients I would like additional rows as well. Is there a clever way to accomplish this.
df1
MessageID From To Outcome
0 105176627 sender1@ recipient1@ aborted
1 105176660 sender2@ recipient2@ delivered
2 105176688 sender3@ recipient2@ delivered
3 105176688 sender3@ recipient3@ delivered
4 105176692 sender5@ recipient6@ aborted
5 105176858 sender4@ recipient6@ aborted
df2
MessageID FileName FileHash
0 105176627 you.pdf 4a352399b53c6a4a7f513b1ac8abb95761b0754e7be8
1 105176660 image002.png c732a64515721f261579f74c77e307b07ba2ed4a6674
2 105176660 image004.png 28a5d5ba70ce941d1632f2f5366f6a13153c269ff1f1
3 105176660 image005.jpg c9fb49a33f18924a02f550890c2e3b1d3148e5c585e9
4 105176688 image001.jpg e663cb7d8c59bf591de64660822e9e4087d14f70de55
5 105176692 image001.png d2cdcc83765b564d2fa6ba7a671c4efa21434c974b92
6 105176858 image.pdf a2a6579d54edc2451b4ee2b47743e858be6b8c5452c2
Desired output:
df3
MessageID From To Outcome FileName FileHash
0 105176627 sender1@ recipient1@ aborted you.pdf 4a352399b53c6a4a7f513b1ac8abb95761b0754e7be8
1 105176660 sender2@ recipient2@ delivered image002.png c732a64515721f261579f74c77e307b07ba2ed4a6674
2 105176660 sender2@ recipient2@ delivered image004.png 28a5d5ba70ce941d1632f2f5366f6a13153c269ff1f1
3 105176660 sender2@ recipient2@ delivered image005.jpg c9fb49a33f18924a02f550890c2e3b1d3148e5c585e9
4 105176688 sender3@ recipient2@ delivered image001.jpg e663cb7d8c59bf591de64660822e9e4087d14f70de55
5 105176688 sender3@ recipient3@ delivered image001.jpg e663cb7d8c59bf591de64660822e9e4087d14f70de55
6 105176692 sender5@ recipient6@ aborted image001.png d2cdcc83765b564d2fa6ba7a671c4efa21434c974b92
7 105176858 sender4@ recipient6@ aborted image.pdf a2a6579d54edc2451b4ee2b47743e858be6b8c5452c2