I have multiple pandas dataframes (5) with some common names index. They have different size. I need sum at least 5 different common colum names
(25 in total) from each dataframe and then compare the sums.
Data:
df_files = [df1, df2, df3, df4, df5]
df_files
out:
[ z name ... a b
0 10 DAD ... 4 4
1 10 DAD ... 5 4
2 10 DAD ... 3 6
3 10 DAD ... 9 2
4 10 DAD ... 11 1
... ... ... ... ... ...
7495 <NA> NaN ... 2 0
7496 <NA> NaN ... 5 3
7497 <NA> NaN ... 3 1
7498 <NA> NaN ... 2 0
7499 <NA> NaN ... 4 3
[7500 rows x 35 columns] #The dataframes are like this type but some vary in size.
What I need is to sum some specific common names
and then compare those sums to see if they match and if they do print out and OK, and if they do not see which value is not macthing with the others like: The value from "column name"
from df3
and df4
do not match with the others values and see the expected common value (when the other majority match) and the colums that match(or if they just match do not need to show it, just the common expected value). Maybe some of them will not match each other but the common expected value it will be assume as the value that is repeated the most and if any of them match that print out that the values needs correction to proceed because any of them match and see the values that not match.
I was begining with something like this:
Example:
df = pd.concat([df1["a"].sum(), df2["a"].sum(), df3["a"].sum(), df4["a"].sum(), df5["a"].sum()])
df
out:
a a a a a
0 425 425 426 427 425
or maybe they can be compared as a list of integers.
I will appreciate your attention with this question. I hope I have been specific.