Pandas merge, right file has more than one instance of the same key

Question

Context: working with a medical database (CSV)

So I have a set of data that looks like this. This will be the 'left' file during the merge.

The 'right' file looks like this.

I need to merge the two CSVs on 'INC_KEY' (unique 9-digit number assigned to each patient), however the 'right' file may contain more than one instance of the same INC_KEY (i.e. more than one TMODE for the same patient).

How do I merge the files such that all of the TMODEs for each patient will be in the resulting merged file? The column names would be: TMODE1, TMODE2, TMODE3, etc.

[How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — It_is_Chris, Jun 29 '21 at 19:06
Is what I provided not good enough? I'm very new to this, the screenshots tell you everything you need to know. My apologies if it is not ideal. — Sean Roudnitsky, Jun 29 '21 at 19:19
Rather than picture, can you extract a sample of your dataframe as text? — Corralien, Jun 29 '21 at 19:24
Images are not sufficient. Did you read the answer to the link I posted about including a small example DataFrame as runnable code? — It_is_Chris, Jun 29 '21 at 19:24

score 0 · Answer 1 · 2021-06-29T20:30:18.327

0

You need to read both tables by pandas as dataframes and then merge them based on your criteria which is matching up both tables and return patients who have multiple modes in the right table:

import pandas as pd
table1 = pd.read_csv('left.csv')
table2 = pd.read_csv('right.csv')
table = pd.merge(table1,table2,on='INC_KEY',how='outer')
table.pivot(index='INC_KEY', columns='TMODE', values=['TRANS_BLOOD_24HOURS','TRANS_BLOOD_MEASURE_DESC','TRANS_BLOOD_CONV','TRANTYPE'])

To understand more about how parameter in the merge function read this please: here

edited Jun 29 '21 at 20:30

answered Jun 29 '21 at 19:50

How is it incomplete? I might misunderstand you. – Jun 29 '21 at 20:03
*The column names would be: TMODE1, TMODE2, TMODE3, etc.* You have to pivot TMODE. – Corralien Jun 29 '21 at 20:14
Yes, I understood now the idea. I edited the solution and I hope that it works now. – Jun 29 '21 at 20:31

Pandas merge, right file has more than one instance of the same key

1 Answers1