-2

i have two dataframes:

df1:
index   count1
A       1
B       1
C       1

df2:
index   count2
B       2
C       2
D       2

And what i want is:

index   count1   count2   diff
A       1        0        1
B       1        2        -1
C       1        2        -1
D       0        2        -2

Tried several examples on pandas' doc and other posts, but none of them works for me. Thanks

QSBK
  • 44
  • 4

3 Answers3

2

We can join outer, then fillna with 0 (downcast='infer') to downcast to int if possible. Then subtract the two columns:

df3 = df1.join(df2, how='outer').fillna(0, downcast='infer')
df3['diff'] = df3['count1'] - df3['count2']

Or concat on axis=1, fillna then subtract:

df3 = pd.concat([df1, df2], axis=1).fillna(0, downcast='infer')
df3['diff'] = df3['count1'] - df3['count2']

df3:

   count1  count2  diff
A       1       0     1
B       1       2    -1
C       1       2    -1
D       0       2    -2

Data and imports:

import pandas as pd

df1 = pd.DataFrame({'count1': {'A': 1, 'B': 1, 'C': 1}})

df2 = pd.DataFrame({'count2': {'B': 2, 'C': 2, 'D': 2}})
Henry Ecker
  • 34,399
  • 18
  • 41
  • 57
0

Use the Concatenate function in pandas for this.

frames = [df1, df2, df3]

result = pd.concat(frames)
Janzaib M Baloch
  • 1,305
  • 2
  • 5
  • 8
-2

How about this:

>>> from datar.all import *
>>> 
>>> df1 = tibble(index=LETTERS[:3], count1=1)
>>> df2 = tibble(index=LETTERS[1:4], count2=2)
>>> 
>>> df1 >> full_join(df2) >> replace_na(0) >> mutate(diff=f.count1-f.count2)
     index    count1    count2      diff
  <object> <float64> <float64> <float64>
0        A       1.0       0.0       1.0
1        B       1.0       2.0      -1.0
2        C       1.0       2.0      -1.0
3        D       0.0       2.0      -2.0

datar is backended by pandas and ports dplyr and related packages from R to python.

I am the author of the package.

Panwen Wang
  • 3,573
  • 1
  • 18
  • 39