How to subtract columns of one dataframe to that of another in python and store the result and both columns in a new dataframe

Question

I’ve 2 data frames which have one common column ‘X’ having all the unique values. I want to subtract each column of data frame 1 to that of data frame 2 which have similar names. dataframe 1

X	bar A	bar B
A1	48.0	1235
A2	53.0	4567

dataframe 2

X	par A	par B
A2	74.0	8342
A1	63.0	6531

Desired result:

X	bar A	par A	Difference	Status
A1	48.0	63.0	15.0	Ok
A2	53.0	74.0	21.0	Ok

To be noted that Difference here refers to barA - parA I.e subtraction. and this kind of data frames are required for each column There are 20 columns in my data frames. The status column gets not ok when value is less Than 0.

I’m thinking of writing a for loop to perform this but unable to start. If anyone can help in this. Ps I’m working on python.

Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) — BigBen, Feb 07 '22 at 16:31
How are your columns named? There is a common prefix or suffix to match columns pairwise? — Corralien, Feb 07 '22 at 16:33
@Corralien first, this is too broad of a question. Second, clearly OP doesn't know about `merge`. Third, there's not enough detail to propose a solution, which is why you (rightly) asked for more. — BigBen, Feb 07 '22 at 16:34
@Corralien Yes columns has common suffix in case of both dataframes for eg I need to subtract col1 of df1 with col1 of df2 then both the columns have a common suffix. — user18143261, Feb 07 '22 at 16:46
Did you read the linked thread on how to use `merge`? That's a *lot* of help, not just a bit :-) — BigBen, Feb 07 '22 at 16:47

score 0 · Accepted Answer · answered Feb 07 '22 at 16:48

Yes you can merge on column "X" to create a single dataframe with all values, then iterate through the columns and append calculated dataframes to a list.

Something like:

df = pd.merge(left=df1, right=df2, on='X', how='inner')
output_frames = []
letters = [i[-1] for i in df.columns if not i == 'X'] #Get letters in your columns 
letters = list(set(letters)) #Get list of letters with single element
for letter in letters:
    newdf = df.loc[:, ['X'] + [i for i in df.columns if i.endswith(letter)]]
    newdf['Difference'] = newdf[f'par {letter}'] - newdf[f'bar {letter}']
    newdf['Status'] = newdf.apply(lambda x: 'OK' if x.Difference >= 0 else 'NOT OK', axis=1)
    output_frames.append(newdf)

Then view output:

for df in newdf:
    print(df)

Or better still, merge the outputs:

df = pd.concat(output_frames, axis=1)

Thank you @AlecZ this has helped a lot – user18143261 Feb 07 '22 at 17:42 — user18143261, Feb 07 '22 at 17:42

score 0 · Answer 2 · answered Feb 07 '22 at 17:10

Your expected output and logic is a bit ambiguous, I'll assume here that you want to subtract the column of df2 that has the same trailing letter as the column in df1. Then compute the status if all differences are positive:

df3 = df1.merge(df2, on='X').set_index('X')

(df3.groupby(df3.columns.map(lambda x: x.split()[1]), axis=1)
    .apply(lambda d: d.diff(axis=1).iloc[:,-1])
    .add_prefix('diff_')
    .assign(Status=lambda d: np.where(d.ge(0).all(1), 'OK', 'Not OK'))
)

output:

     diff_A  diff_B Status
X                         
A1     15.0  5296.0     OK
A2     21.0  3775.0     OK

How to subtract columns of one dataframe to that of another in python and store the result and both columns in a new dataframe

2 Answers2