I’m trying to merge multiple CSV files into one master file so I can run predictions on it. What am I doing wrong here?

import pandas as pd

# reading csv files
d1 = pd.read_csv('/content/drive/My Drive/mimic_ed/addmition.csv')
d2 = pd.read_csv('/content/drive/My Drive/mimic_ed/diagnosis.csv')
d3 = pd.read_csv('/content/drive/My Drive/mimic_ed/icu.csv')
d4 = pd.read_csv('/content/drive/My Drive/mimic_ed/med.csv')
d6 = pd.read_csv('/content/drive/My Drive/mimic_ed/vitalsign.csv')
# using merge function by setting how='outer'
output1 = pd.merge(d1, d2, d3, d4, d5, d6,
                   on='patient_id',
                   how='outer')

# displaying result
print(output1)

This is the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-8df50afa1ee0> in <module>
      9 d6 = pd.read_csv('/content/drive/My Drive/mimic_ed/vitalsign.csv')
     10 # using merge function by setting how='outer'
---> 11 output1 = pd.merge(d1, d2, d3, d4, d5, d6,
     12                         on='patient_id',
     13                 how='outer')

TypeError: merge() got multiple values for argument 'on'

I was expecting the files to merge.

1 Answer

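pandas.merge accepts only two DataFrames, left and right, as its first two positional arguments, but you're passing six. The extra ones spill into the parameters that follow: d3 is bound to how and d4 is bound to on. When Python then reaches on='patient_id', on has been given a value twice, which raises the TypeError.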

To avoid this kind of confusion, I recommend passing every argument by keyword, even the ones that can be passed positionally:

pd.merge(left=d1, right=d2, on='patient_id', how='outer')
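
As a minimal sketch of the keyword form on toy data: the two frames below are hypothetical stand-ins for two of the CSV files, and every column except patient_id is made up for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for two of the CSV files; the values are invented.
admissions = pd.DataFrame({"patient_id": [1, 2, 3], "ward": ["A", "B", "C"]})
diagnoses = pd.DataFrame({"patient_id": [2, 3, 4], "icd_code": ["I10", "E11", "J45"]})

# Keyword arguments make it explicit which frame is left and which is right,
# so nothing can silently land in `how` or `on`.
merged = pd.merge(left=admissions, right=diagnoses, on="patient_id", how="outer")
print(merged)
```

With how='outer', patients that appear in only one frame are kept, and the columns they are missing are filled with NaN.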

To merge more than two, consider: https://stackoverflow.com/a/44338256/1054322
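
One common way to chain the two-frame merge across a whole list is functools.reduce. The sketch below assumes every frame has a patient_id column; merge_all and the toy frames are hypothetical names and data, not something from your files.

```python
from functools import reduce

import pandas as pd

# Hypothetical miniature frames standing in for d1, d2, d3, ...
frames = [
    pd.DataFrame({"patient_id": [1, 2], "admit_type": ["ER", "elective"]}),
    pd.DataFrame({"patient_id": [2, 3], "icd_code": ["I10", "E11"]}),
    pd.DataFrame({"patient_id": [1, 3], "heart_rate": [72, 88]}),
]


def merge_all(dfs, key="patient_id", how="outer"):
    """Merge a list of DataFrames pairwise, left to right, on a shared key."""
    return reduce(
        lambda left, right: pd.merge(left=left, right=right, on=key, how=how),
        dfs,
    )


master = merge_all(frames)
print(master)
```

With your data this would be called as merge_all([d1, d2, d3, d4, d6]), adding d5 once it has actually been read in; the snippet in the question never defines it. Keep in mind that if patient_id repeats within a file (for example, several vital-sign rows per patient), each merge multiplies the matching rows, so the result can grow quickly.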

MatrixManAtYrService