1

I would like to add multiple data frames together, with differing columns, but if there are similar columns/cells, then the values will be added together.

Here is an example:

Input:

df1

Region Code (Index) Number of police stations Amount of crime Average Age
B 2 2 35
A 1 5 45
C 3 5 56

df2

Region Code (Index) Number of police stations Amount of crime
B 5 2
A 3 5

Expected output:

Region Code (Index) Number of police stations Amount of crime Average Age
B 7 4 35
A 4 10 45
C 3 5 56

Any indication of how to do it or maybe a better way to structure would be greatly appreciated!

chrisC
  • 37
  • 3
  • Your question appears to be missing some details in `df2` – The Singularity Sep 21 '21 at 10:06
  • Please refer here. https://stackoverflow.com/questions/49111859/how-to-merge-two-dataframes-and-sum-the-values-of-columns – Aditya Sep 21 '21 at 10:07
  • Does this answer your question? [Pandas DataFrame merge summing column](https://stackoverflow.com/questions/23361218/pandas-dataframe-merge-summing-column) – mujjiga Sep 21 '21 at 10:09
  • hi @mujjiga not exactly, the actual dataframe is much more complex than that, I will close this question and re-ask with more context, thanks! – chrisC Sep 22 '21 at 00:36

3 Answers3

1

Try this using pd.concat to join the 2 dataframe. Then groupby the Region Code (Index) and aggregate column by sum

df = pd.concat([df1,df2])
df = df.groupby(by="Region Code (Index)", as_index=False).agg({"Number of police stations":"sum", "Amount of crime":"sum", "Average Age":"sum"}
Raymond Toh
  • 779
  • 1
  • 8
  • 27
0

Try concat and groupby with sum:

pd.concat([df1, df2]).groupby('Region Code (Index)', as_index=False, sort=False).sum()

  Region Code (Index)  Number of police stations  Amount of crime  Average Age
0                   B                          7                4         35.0
1                   A                          4               10         45.0
2                   C                          3                5         56.0

I added sort=False to keep the order, B A C.

U13-Forward
  • 69,221
  • 14
  • 89
  • 114
0

you can use pd.concat

result = pd.concat([df1, df2], axis=1, join="inner")