-1

DF 1

| ColA     | Colb           | Stock    | Date       |
| -------- | -------------- | -------- | ---------- |
| A        | 1              | 3        | 2022-26-12 |
| B        | 2              | 3        | 2022-26-12 |
| C        | 3              | 3        | 2022-26-12 |

DF 2

| ColA     | Colb           | Sales    | Date       |
| -------- | -------------- | -------- | ---------- |
| A        | 1              | 1        | 2022-26-12 |
| B        | 2              | 1        | 2022-26-12 |
| C        | 3              | 1        | 2022-26-12 |

Given any number of columns to join on, how do you do Dataframe arithmetic in pandas, for instance if I wanted to subtract the above two Dataframes to get something like this

STOCK AT END OF THE DAY

| ColA     | Colb           | Stock    | Date       |
| -------- | -------------- | -------- | ---------- |
| A        | 1              | 2        | 2022-26-12 |
| B        | 2              | 2        | 2022-26-12 |
| C        | 3              | 2        | 2022-26-12 |

So stock - sales given all the common columns, in this case

Edit: The equivalent SQL code to my problem is:

SELECT
    DF1.ColA,
    DF1.Colb,
    DF1.Date,
    DF1.Stock - coalesce(DF2.Sales, 0)
FROM
    DF1
    LEFT JOIN DF2
        on
            DF1.ColA = DF2.ColA and
            DF1.Colb = DF2.Colb and
            DF1.Date = DF2.Date
NZJL
  • 115
  • 9
Ari Frid
  • 127
  • 7
  • I have checked previous questions and the Pandas documentation, they did not answer my question. I can give you what I want in SQL or I can give wrong answers in asking question itself. But I am asking the question because I do not know how to do this specific operation. – Ari Frid Dec 26 '22 at 19:14
  • 1
    Done can you look over the edit and tell me if there is anything else wrong with the post? – Ari Frid Dec 26 '22 at 19:30
  • Sure! Please see [this SO Q&A](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for future reference. I would just request you to add different cases which you need help to solve (which you mentioned in the comments of the answer) in the question directly. It will help future users to solve their questions easily when they read this Q&A. Also, you can directly paste the output of `print(dataframe)` in the code block in the question. – medium-dimensional Dec 26 '22 at 19:36

1 Answers1

1

If they have the same number of rows and columns then do something like that:

df3 = df1[['ColA', 'Colb','Date']]
df3['Stock'] = df1.Stock - df2.Sales

However, if they are different merge them then do what you want:

df3= pd.merge(df1, df2, on='ColA', how='inner')
df3['Stock'] = df3.Stock - df3.Sales

In your case, based on your edited question:

df3 = pd.merge(df1, df2,  how='left', left_on=['ColA','Colb','Date'], right_on = ['ColA','Colb','Date'])
#rename the columns as you want
df3.columns=['col1','col2']
#only select columns you want
df3=df3[['col1','col2']]
#then do your subtraction
df3['Stock'] = df3.col1 - df3.col2
NZJL
  • 115
  • 9
  • In practice I am looking for something that does not have the same number of rows or columns. Rather 2 completely different Dataframes with some values in common, so as to get the resulting values, from the common values. – Ari Frid Dec 26 '22 at 19:10
  • How would you do the latter if the column names were the same so lets both of the column names were "Stock" how would you go about doing the same thing? – Ari Frid Dec 26 '22 at 19:21
  • @AriFrid you can use `merge` as shown in updated answer, also [concat](https://pandas.pydata.org/docs/reference/api/pandas.concat.html) use for the same purpose. – NZJL Dec 26 '22 at 19:23
  • But if you merge with the same column name it will change the column name to something strange like stock_x and stock_y no? – Ari Frid Dec 26 '22 at 19:29
  • 1
    you can specify the suffixes as you want using `suffixes=('_df1', '_df2')`, just check the [documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html) it give you all the details – NZJL Dec 26 '22 at 19:32
  • @AriFrid related name changing; also you can rename the columns after the merge using `df3.columns = ['a', 'b','c']` – NZJL Dec 26 '22 at 19:38