I am writing a piece of simulation software in Python using pandas. Here is my problem:
Imagine you have two pandas DataFrames, dfA and dfB, with numeric columns A and B respectively. They have different numbers of rows, n and m respectively, with n > m. Moreover, dfA includes a binary column C, which contains exactly m zeros (one for each row of dfB) and is 1 everywhere else. Assume both dfA and dfB are sorted.
I want to add the values in B, in order, to the values in column A in the rows where C == 0. Rows where C == 1 should keep their value of A unchanged.
In the example below, n = 6 and m = 3.
Example data:
import pandas as pd

dataA = {'A': [7, 7, 7, 7, 7, 7],
         'C': [1, 0, 1, 0, 0, 1]}
dfA = pd.DataFrame(dataA)  # n = 6 rows
dfB = pd.DataFrame([3, 5, 4], columns=['B'])  # m = 3 rows
Example pseudocode (DOES NOT WORK):
if dfA['C'] == 1:
    dfD['D'] = dfA['A']
else:
    dfD['D'] = dfA['A'] + dfB['B']
Expected result:
dfD['D']
[7,10,7,12,11,7]
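To pin down the intended behaviour, here is a minimal loop-based sketch that reproduces the expected result for the example data. It is only a reference for correctness, not the fast solution being asked for; the counter j and the list result are names introduced purely for illustration.

```python
import pandas as pd

dfA = pd.DataFrame({'A': [7, 7, 7, 7, 7, 7],
                    'C': [1, 0, 1, 0, 0, 1]})
dfB = pd.DataFrame([3, 5, 4], columns=['B'])

result = []
j = 0  # position of the next unused row in dfB (illustrative counter)
for a, c in zip(dfA['A'], dfA['C']):
    if c == 1:
        # C == 1: keep the value of A unchanged
        result.append(a)
    else:
        # C == 0: consume the next value of B, in order
        result.append(a + dfB['B'].iloc[j])
        j += 1

dfD = pd.DataFrame({'D': result})
print(dfD['D'].tolist())  # [7, 10, 7, 12, 11, 7]
```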
I can only think of obscure for loops with index counters for each of the three vectors, but I am sure that there is a faster way by writing a function and using apply. But maybe there is something completely different that I am missing.
*NOTE: In the real problem, the rows are not single values but row vectors of equal length. Moreover, the operation is not simple addition but a weighted average of the two row vectors.
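For the vector-valued case in the note, a boolean mask on C might be one loop-free candidate. The sketch below is only an assumption about how that could look: the column names A1, A2, B1, B2, the output names D1, D2, and the weight w are all made up for illustration, and it relies on the number of rows with C == 0 being equal to the number of rows in dfB.

```python
import pandas as pd

# Hypothetical stand-in data: each row holds a vector of length 2.
dfA = pd.DataFrame({'A1': [7.0] * 6,
                    'A2': [1.0] * 6,
                    'C': [1, 0, 1, 0, 0, 1]})
dfB = pd.DataFrame({'B1': [3.0, 5.0, 4.0],
                    'B2': [2.0, 4.0, 6.0]})
w = 0.25  # hypothetical weight given to the dfA row vector

mask = dfA['C'].to_numpy() == 0      # rows that should be combined with dfB
a = dfA[['A1', 'A2']].to_numpy()     # shape (n, 2)
b = dfB[['B1', 'B2']].to_numpy()     # shape (m, 2)

if mask.sum() != len(b):
    raise ValueError("number of rows with C == 0 must equal len(dfB)")

d = a.copy()                         # rows with C == 1 stay as they are
d[mask] = w * a[mask] + (1 - w) * b  # weighted average, rows matched in order
dfD = pd.DataFrame(d, columns=['D1', 'D2'])
print(dfD)
```

Because the masked assignment fills the selected rows in their original order, the k-th row with C == 0 is paired with the k-th row of dfB, which is the same pairing an index counter would produce.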