I have two dataframes as follows:
import pandas as pd
df1= pd.DataFrame({'Integer':(1,1,1,1),'Name':('John','Mary','Jason','Ruth')})
print(df1)
Integer Name
0 1 John
1 1 Mary
2 1 Jason
3 1 Ruth
df2= pd.DataFrame({'Integer':(2,2,2,2,2),'Name':('Rob','Jason','Mary','Mark','Leo')})
print(df2)
Integer Name
0 2 Rob
1 2 Jason
2 2 Mary
3 2 Mark
4 2 Leo
I want an operation that will 'merge' these two dataframes in a way that it sums the Integer column for the rows with the same string in the Name column, which are not necessarily on the same row. If the Name doesn't appear on both dfs, I'd like the row to just append at the end. Desired results would be:
print(df3)
Integer Name
0 1 John
1 3 Mary
2 3 Jason
3 1 Ruth
4 2 Rob
5 2 Mark
6 2 Leo
I looked into merging, joining, and concat options, but I believe since my rows don't match I may need something similar to vlookup from Excel? The index is not really relevant, happy to reset it during the process. Thanks in advance!
Edit: pd.concat([df1, df2]).groupby('Name', as_index=False).sum() does not exactly work in my situation. The resulting df3 only contains the rows which are contained both in df1 and df2. The actual dataframe I am working with has many more columns, some strings and some integers, so I'd also have to specify the column in which to perform the .sum().