I have two dataframes, dataframe_a (dfa) and dataframe_b (dfb), each with a variable number of columns but the same number of rows.

I need to subtract each column of dfb from every column of dfa and collect the subtracted values in a new dataframe.

Right now I'm doing this manually:

sub1 = dfa.subtract(dfb[0], axis = 0)
sub2 = dfa.subtract(dfb[1], axis = 0)
sub3 = dfa.subtract(dfb[2], axis = 0)
# etc.

Then I use pd.concat to join all the columns:

subbed = pd.concat([sub1, sub2, sub3], axis=1, ignore_index=True)
subbed = pd.concat([dfa, subbed], axis=1)

This all seems horribly inefficient and makes me feel quite bad at programming lol. How would you do this without subtracting each column manually, writing the results directly to a new dataframe?

BioProg
  • Nested loops can do the trick, can't they? If you want the code even smaller you could use the module 'itertools', but I think it's overkill in your case. – Dodilei Mar 09 '21 at 16:52
  • I'm not sure how to do that with two dataframes -- seems to only work with series or lists. – BioProg Mar 09 '21 at 17:04

1 Answer

Setup

import pandas as pd
import numpy as np
from itertools import product

# Repeat each single row five times so the data matches the 5-row index
dfa = pd.DataFrame([[8, 7, 6]] * 5, range(5), [*'ABC'])
dfb = pd.DataFrame([[1, 2, 3, 4]] * 5, range(5), [*'DEFG'])

Pandas' concat

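I use the reversed-subtraction method rsub with the axis=0 argument: dfb.rsub(s, axis=0) computes s - dfb, aligning the Series s with dfb on the row index. Passing a dict to pd.concat turns its keys (the dfa column names) into the outer level of the resulting columns. See this Q&A for more information.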

pd.concat({c: dfb.rsub(s, axis=0) for c, s in dfa.items()}, axis=1)

   A           B           C         
   D  E  F  G  D  E  F  G  D  E  F  G
0  7  6  5  4  6  5  4  3  5  4  3  2
1  7  6  5  4  6  5  4  3  5  4  3  2
2  7  6  5  4  6  5  4  3  5  4  3  2
3  7  6  5  4  6  5  4  3  5  4  3  2
4  7  6  5  4  6  5  4  3  5  4  3  2
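
If the layout from the question is preferred (one block per dfb column, each holding dfa minus that column), the same dict-comprehension pattern can be flipped around. This is only a minimal sketch, assuming the dfa and dfb from the setup above. The names by_b and flat are made up for illustration, and joining the two column levels with an underscore is just one possible choice:

```python
import pandas as pd

# Same sample data as in the setup (each row repeated five times)
dfa = pd.DataFrame([[8, 7, 6]] * 5, range(5), [*'ABC'])
dfb = pd.DataFrame([[1, 2, 3, 4]] * 5, range(5), [*'DEFG'])

# One block per dfb column: dfa minus that column, aligned on the row index
by_b = pd.concat({c: dfa.sub(s, axis=0) for c, s in dfb.items()}, axis=1)

# Optionally flatten the two-level column labels, e.g. ('D', 'A') -> 'A_D'
flat = by_b.copy()
flat.columns = [f'{a}_{b}' for b, a in by_b.columns]
```

Here by_b keeps the two-level columns (dfb name on top, dfa name below), while flat has plain string labels, which may be easier to join back onto dfa with pd.concat([dfa, flat], axis=1).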

Numpy's broadcasting

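Adding a new axis to each array lets NumPy pair every column of dfa with every column of dfb: a[..., None] has shape (5, 3, 1), b[:, None] has shape (5, 1, 4), and their difference broadcasts to (5, 3, 4). Play around with it to see how it works.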

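a = dfa.to_numpy()             # shape (5, 3)
b = dfb.to_numpy()             # shape (5, 4)
c = a[..., None] - b[:, None]  # shape (5, 3, 4)

# Pair each (dfa column, dfb column) label with its column of differences
df = pd.DataFrame(dict(zip(
    product(dfa, dfb),
    c.reshape(len(dfa), -1).transpose()
)))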

df

   A           B           C         
   D  E  F  G  D  E  F  G  D  E  F  G
0  7  6  5  4  6  5  4  3  5  4  3  2
1  7  6  5  4  6  5  4  3  5  4  3  2
2  7  6  5  4  6  5  4  3  5  4  3  2
3  7  6  5  4  6  5  4  3  5  4  3  2
4  7  6  5  4  6  5  4  3  5  4  3  2
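
The broadcasting approach can also be wrapped in a small function so the row count is not hard-coded. This is a rough sketch rather than a definitive implementation: pairwise_sub is a hypothetical helper name, and it assumes both frames have the same number of rows in the same order, since NumPy subtracts by position and does not align on the index:

```python
import numpy as np
import pandas as pd

def pairwise_sub(left, right):
    """Subtract every column of `right` from every column of `left` (hypothetical helper)."""
    a = left.to_numpy()    # shape (n_rows, n_left)
    b = right.to_numpy()   # shape (n_rows, n_right)
    # (n_rows, n_left, 1) - (n_rows, 1, n_right) -> (n_rows, n_left, n_right)
    c = a[:, :, None] - b[:, None, :]
    # Column labels in the same order as the flattened last two axes
    cols = pd.MultiIndex.from_product([left.columns, right.columns])
    return pd.DataFrame(c.reshape(len(left), -1), index=left.index, columns=cols)

dfa = pd.DataFrame([[8, 7, 6]] * 5, range(5), [*'ABC'])
dfb = pd.DataFrame([[1, 2, 3, 4]] * 5, range(5), [*'DEFG'])
out = pairwise_sub(dfa, dfb)
```

Building the columns with pd.MultiIndex.from_product avoids the zip/transpose step and keeps the original row index on the result.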
piRSquared