0

I have two separate dataframes and i would like to merge them based on a specific column. I have tried to concat the two but it just places the dataframes on top of each other. and i have tried pd.merge but it duplicates the other columns

Here is what i have...

Current Dataframe

I want something similar to this where the dates are merged and the symbols share the date. enter image description here

Here is some dummy code if you'd like to implement an example

df_msft = [['2020-1-1', 10], ['2020-1-2', 15], ['2020-1-3', 14]]
df1 = pd.DataFrame(df_msft , columns = ['datetime', 'price'])

df_aapl = [['2020-1-1', 10], ['2020-1-2', 15], ['2020-1-3', 14]]
df2 = pd.DataFrame(df_aapl , columns = ['datetime', 'price'])
  • One thing to be careful about here is the difference between the data in a dataframe and how that data is displayed to a user. You can easily create a dataframe which repeats the date in each line and then display that in a table format like you show in the screenshot. – Code-Apprentice Mar 22 '22 at 21:43
  • What do you mean by "it duplicates the other columns" when you tried pd.merge? Which "other" columns are duplicated? – Code-Apprentice Mar 22 '22 at 21:44
  • Lets say i merged it based on the 'datetime' column, after merging it would duplicate the columns with the same name which are all of them such as: Ticker_x, Ticker_y – LemonSqueezy Mar 22 '22 at 21:48
  • I think I get what you mean. So if both dataframes have a column named `high`, then the "merged" data frame will have two columns, one from each of the original dataframes? – Code-Apprentice Mar 22 '22 at 22:26

1 Answers1

1

You can use pd.concat:

out = pd.concat([df1, df2], keys=['MSFT', 'AAPL']).droplevel(1) \
        .rename_axis('symbol').set_index('datetime', append=True) \
        .swaplevel().sort_index()
print(out)

# Output
                 price
datetime symbol       
2020-1-1 AAPL       10
         MSFT       10
2020-1-2 AAPL       15
         MSFT       15
2020-1-3 AAPL       14
         MSFT       14

Export to excel:

out.to_excel('output.xlsx', merge_cells=True)

enter image description here

Corralien
  • 109,409
  • 8
  • 28
  • 52