0

I have dataframe #1 looks like the following:

df1

My second dataframe looks like the following:

df2

So the first dataframe uses id and date as multiindex, while the second dataframe uses date and port as multiindex.

I managed to merge the two dataframes by firstly reset_index of both dataframes and then pd.merge(df1,df2,how='outer',on=['date','port']) to achieve the following structure: merged

Lastly, I just set_index of this final structure using id and date and sort on id

My question: is there a way to merge the two dataframes WITHOUT resetting their multiindex, to achieve the structure I want with id and date still being the multiindex?

NOnaMe
  • 27
  • 5
  • if its set on index, try `join`, `combine_first` or `update` better if you add in some data, but I think your answer lies in the marked duplicate – Umar.H Jun 16 '20 at 09:52
  • I tried df1.join(df2,how='outer', on=['date','port']). Now I don't have to reset index before join, neither do I have to set index after join. The only thing I have to do is sort_index on the new dataframe. – NOnaMe Jun 16 '20 at 10:26

1 Answers1

0

You can use this:

new_df = df1.merge(
        df2, how="outer", left_index=True, right_index=True
    )

But this will give you two columns containing port, because df1 has port as column and df2 has it as index. To counter this you can just drop the column port using this:

new_df.drop(['port'], axis=1, inplace=True)
Kaustubh Lohani
  • 635
  • 5
  • 15
  • Thank you, what I got from your method was a three-level multiindex:date id and port. And the final structure ended up with doubling the observations of what it is supposed to be. I guess I will just either reset index before merge or use join. – NOnaMe Jun 16 '20 at 10:31