Beginner programmer here. I have been tasked with cleaning medical data that is stored in CSV format.
(please keep in mind while you read this that I am just a beginner programmer so your patience is appreciated)
I have a file, which we'll call data1, that looks like this: data1. It has about 17,000 rows, one per patient.
inc_key refers to a unique patient ID.
I have another file, which we'll call data2. It has the same format but stores different information, and it contains millions of rows/patients.
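Because data2 is so large, I assume it may not be practical to load it all at once. Here is a minimal sketch of what I think reading it in pieces could look like, keeping only the rows whose inc_key also appears in data1. The sample data, the `diagnosis` column, and the tiny `chunksize` are made up for illustration; only `inc_key` comes from my real files.

```python
import io
import pandas as pd

# Made-up stand-ins for the real files; only inc_key is a real column name.
data1 = pd.DataFrame({"inc_key": [101, 102, 103]})
big_csv = io.StringIO(
    "inc_key,diagnosis\n102,fracture\n500,other\n101,burn\n999,other\n777,other\n"
)

# The patient IDs from data1 that we care about.
wanted = set(data1["inc_key"])

# Read the large file a few rows at a time and keep only the matching rows.
# (chunksize=2 is only for this demo; a real run would use a much larger value.)
kept = []
for chunk in pd.read_csv(big_csv, chunksize=2):
    kept.append(chunk[chunk["inc_key"].isin(wanted)])

# Combine the kept pieces into one much smaller DataFrame.
data2_small = pd.concat(kept, ignore_index=True)
print(data2_small)
```

With the real file, `big_csv` would be replaced by the path to data2, and `data2_small` should then hold at most the rows that belong to the patients in data1.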
My goal is this: for each row/patient in data1, I need to find the row in data2 with the same inc_key value and add that row's columns to the end of the patient's row in data1.
In other words, I need to merge the two files on matching inc_key values, keeping only the patients that are in data1.
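To make the goal concrete, here is a minimal sketch of the result I am after, using pandas `merge` on small made-up data. The `age` and `diagnosis` columns are hypothetical; only `inc_key` comes from my real files. I am assuming a left merge is the right kind of join here, since I want to keep every patient in data1.

```python
import io
import pandas as pd

# Tiny made-up stand-ins for the real CSV files; only inc_key is a real column name.
data1_csv = io.StringIO("inc_key,age\n101,34\n102,57\n103,45\n")
data2_csv = io.StringIO("inc_key,diagnosis\n102,fracture\n101,burn\n999,other\n")

data1 = pd.read_csv(data1_csv)
data2 = pd.read_csv(data2_csv)

# Left merge: keep every row of data1 and add data2's columns where inc_key matches.
# Patients in data1 with no match in data2 get NaN in the added columns.
merged = data1.merge(data2, on="inc_key", how="left")
print(merged)
```

In this sketch, patient 103 has no match in data2, so its `diagnosis` is empty (NaN), and patient 999 is dropped because it is not in data1. If data2 could contain more than one row per inc_key, I believe the merged result would repeat the data1 row for each match, so that is something I would need to check.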
I am using the pandas module. Can anyone help me with this?
Thank you in advance to anyone who helps, it is sincerely appreciated since I am only a beginner programmer.