-2

I am working on a project with financial data. In this case I have two data frames, one containing fundamental values and the other with prices. Because I had to use two different databases the dfs differ in size, mainly due to mismatches in companies ('tic') and/or dates ('year').

Here is an example for fund_df:

        tic       assets      year
0      AAPL       123.99      1999
1      AAPL       143.20      2000
..     ...          ...       ...
200    GMCF         9.56      2013
201    GMCF        11.21      2014
..     ...          ...       ...
1543   TSLA       201.23     2015
1544   TSLA       233.49     2016

Whereas the prices_df looks like this:

        tic       prices      year
0      MPRD       56.789      2000
1      MPRD       48.222      2001
..     ...          ...       ...
200    GM         87.991      2012
201    GM        102.334      2013
..     ...          ...       ...
1543   ZZ         34.567      2017
1544   ZZ         29.887      2018

My goal is to unify these two dfs so that I can work with both fundamental and price values. However, because I am afraid that the datasets my differ both under a tic and year point of view, I wish to unify these values based on a match that needs to be at both levels, so that the fundamental values of TSLA in 2015 match the price of TSLA in 2015.

I have tried the following piece of code:

merged_df = merge(fund_df, prices_df, by.fund_df=['tic', 'year'], by.prices_df['tic', 'year'], all = TRUE)

However I keep on receiving this error message:

SyntaxError: keyword can't be an expression

Can anyone help me find a solution to merge the two dfs based on both a tic and year constraint?

Thank you in advance for your help, I appreciate it!

tigio_33
  • 149
  • 8
  • You should use `left_on` and `right_on` to specify the fields to join on. – Chris Apr 04 '20 at 08:19
  • Also, `all` isn't a valid argument for `merge`. Do you mean to specify the join type, (e.g. `how='outer'`)? – Chris Apr 04 '20 at 08:20
  • @Chris thank you for your comment :) Could you please code it out so that I can understand what you mean? I am not too fluent with python, so I would appreciate something more visual! – tigio_33 Apr 04 '20 at 08:22
  • 1
    Posted as an answer so I can format the code with comments in a more readable manner. – Chris Apr 04 '20 at 08:30

1 Answers1

1

As per the documentation, the syntax for pandas.merge is:

merged_df = merge(
    fund_df,                     # left dataframe
    prices_df,                   # right dataframe
    left_on=['tic', 'year'],     # columns to join on in left dataframe
    right_on=['TICKER', 'year'], # columns to join on in right dataframe
    how='outer')                 # type of join (e.g. inner, outer, left, etc.)
Chris
  • 1,618
  • 13
  • 21