I am trying to merge two dataframes on their date column, but in df1 the date column is titled Index, while in df2 it is called Date.

df1

Index SMB HML RF
2018 2 3 4
2019 4 4 5
2020 4 5 2

df2

Date ABC DEF GHI
2018 22 38 49
2019 41 42 59
2020 41 54 29

I have tried to set the index in df1, but I keep getting the error message: "None of ['Index'] are in the columns"

This is the code I have tried:

df1 = df1.set_index('Index').T.set_index('Date').T

The df1 data was imported, if that changes anything. Eventually I would like to merge the two dataframes so the result looks something like this:

df3

Date ABC DEF GHI SMB HML RF
2018 22 38 49 2 3 4
2019 41 42 59 4 4 5
2020 41 54 29 4 5 2
heff
  • Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) –  Nov 21 '21 at 02:21

1 Answer

df3 = df2.merge(df1, right_on='Index', left_on='Date').drop('Index', axis=1)

Output:

>>> df3
   Date  ABC  DEF  GHI  SMB  HML  RF
0  2018   22   38   49    2    3   4
1  2019   41   42   59    4    4   5
2  2020   41   54   29    4    5   2
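
For reference, here is a self-contained version of the merge above. It is only a sketch that assumes `Index` and `Date` are ordinary columns, exactly as the tables in the question show; the two dataframes are rebuilt by hand from those tables, so your imported data may not look like this.

```python
import pandas as pd

# Sample frames rebuilt from the question's tables (assumption: the
# year values live in regular columns named "Index" and "Date").
df1 = pd.DataFrame({
    "Index": [2018, 2019, 2020],
    "SMB": [2, 4, 4],
    "HML": [3, 4, 5],
    "RF": [4, 5, 2],
})
df2 = pd.DataFrame({
    "Date": [2018, 2019, 2020],
    "ABC": [22, 41, 41],
    "DEF": [38, 42, 54],
    "GHI": [49, 59, 29],
})

# Match df2["Date"] against df1["Index"], then drop the duplicate key column.
df3 = df2.merge(df1, left_on="Date", right_on="Index").drop("Index", axis=1)
print(df3)
```

If this runs but the same line fails on your real data with `KeyError: 'Index'`, then `Index` is most likely not a column in your actual dataframe.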
  • I tried this code and got another error message that just says: KeyError: 'Index' – heff Nov 21 '21 at 00:11
  • Your dataframe(s) must not be exactly like your question says they are. Could you post the result of `print(df1.head().to_dict())` and `print(df2.head().to_dict())` to the question? –  Nov 21 '21 at 00:29
  • I can't post those results because my actual data values are much larger and go over the character limit. Can you tell me what I should be looking for, or suggest another way to send the results? – heff Nov 21 '21 at 00:44
  • You can upload a text file to something like https://pastebin.com/ or https://gist.github.com. –  Nov 21 '21 at 00:45
  • https://gist.github.com/notHeff/20fd18a9b283e20ef1b4e7f3223b4a3e – heff Nov 21 '21 at 00:54
  • Do your indexes have names? Try `print(three_yr_monthly_return.index.name)` and `print(three_yr_FF_French.index.name)`. –  Nov 21 '21 at 01:05
  • `three_yr_FF_French.index.name = "your_index_name"` –  Nov 21 '21 at 01:16
  • You'll need to do that for both dataframes first, and then adjust `Index` and `Date` in my code sample to match. –  Nov 21 '21 at 01:16
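
The suggestion in the last few comments can be sketched roughly as follows. This assumes the year values sit in each dataframe's index rather than in a column, which would explain both error messages but is not confirmed in the thread. The frames `df1` and `df2` are stand-ins built from the question's tables, and the index name `"Date"` is just an illustrative choice.

```python
import pandas as pd

# Hypothetical stand-ins for the real data: the years are the index here,
# so neither frame has an "Index" or "Date" column.
df1 = pd.DataFrame(
    {"SMB": [2, 4, 4], "HML": [3, 4, 5], "RF": [4, 5, 2]},
    index=[2018, 2019, 2020],
)
df2 = pd.DataFrame(
    {"ABC": [22, 41, 41], "DEF": [38, 42, 54], "GHI": [49, 59, 29]},
    index=[2018, 2019, 2020],
)

# Diagnose: check which labels are columns and whether the index is named.
print(df1.columns.tolist())
print(df1.index.name)

# Give both indexes the same name, then turn them into regular columns.
df1.index.name = "Date"
df2.index.name = "Date"

# With a shared "Date" column on both sides, a plain merge on that key works.
df3 = df2.reset_index().merge(df1.reset_index(), on="Date")
print(df3)
```

Because both key columns end up with the same name, there is no leftover `Index` column to drop afterwards.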