
I have a df:

import numpy as np
import pandas as pd

df = pd.DataFrame(index = ['A','B','C'],
                  columns = ['1','2','3','4'],
                  data = [[100,60,40,60],
                          [200,10,50,80],
                          [50, np.nan, np.nan, np.nan]])
        1           2           3           4        
A       100         60          40          60
B       200         10          50          80
C       50          

I would like to calculate the remaining C index values, but each calculation is dependent on the previous value like so:

        1           2           3           4        
A       100         60          40          60
B       200         10          50          80
C       50          A2+B2-C1    A3+B3-C2    A4+B4-C3

I checked this answer and tried the following:

new = [df.loc['C'].values]

for i in range(1, len(df.index)):
    new.append(new[i-1]*df.loc['A'].values[i]+df.loc['B'].values[i]-df.loc['C'].values[i-1])
df.loc['C'] = new

But I get:

ValueError: cannot set a row with mismatched columns

Also, that question and its answers are quite outdated. Is there a newer way to do this kind of recursive calculation inside a pandas DataFrame?

Jonas Palačionis

1 Answer


The key is to print your variables to make sure they contain what you think they do.


  • First, new = [df.loc['C'].values] builds a list whose single item is an array; you just want a flat list of values.

  • Then, in the loop you multiply by new[i-1] (new[i-1] *), which isn't in the schema above.

  • You read df.loc['C'].values[i-1] but never update it (you save the result in a list), so you can't expect it to work. Either:

    • update the DataFrame directly and use - df.loc['C'].values[i-1], or
    • keep the temporary list and use - new[i - 1].
  • You don't want to append but to overwrite the values (otherwise you'd need to start new with only one value).
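As a quick way to follow the "print your variables" advice, here is a minimal sketch that rebuilds the df from the question and inspects what the first line of the attempt actually creates. It also prints the row and column counts, since the attempt loops over len(df.index) while the values to fill run along the columns.

```python
import numpy as np
import pandas as pd

# Same frame as in the question
df = pd.DataFrame(index=['A', 'B', 'C'],
                  columns=['1', '2', '3', '4'],
                  data=[[100, 60, 40, 60],
                        [200, 10, 50, 80],
                        [50, np.nan, np.nan, np.nan]])

new = [df.loc['C'].values]

print(type(new), len(new))             # a list holding exactly one item
print(type(new[0]), new[0].shape)      # that item is the whole row as an array
print(len(df.index), len(df.columns))  # 3 rows, but 4 columns to walk along
```

Seeing a one-item list wrapping a four-element array makes it clearer why new[i-1] is not a single number inside the loop.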


With a separate list

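new = df.loc['C'].to_list()

# Walk along the columns; each value depends on the one just computed
for i in range(1, len(df.columns)):
    new[i] = df.loc['A'].values[i] + df.loc['B'].values[i] - new[i - 1]

df.loc['C'] = new  # write the completed row back into the DataFrame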

Without a separate list

# Rows by position: 0 = A, 1 = B, 2 = C
for i in range(1, len(df.columns)):
    df.iloc[2, i] = df.iloc[1, i] + df.iloc[0, i] - df.iloc[2, i - 1]
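For reference, a self-contained sketch of the same recurrence, C[i] = A[i] + B[i] - C[i-1], run end to end on the df from the question. The helper name fill_row_c is made up for this illustration and is not a pandas function; it works on plain NumPy arrays inside the loop and writes the row back once at the end.

```python
import numpy as np
import pandas as pd

def fill_row_c(df):
    """Hypothetical helper: fill row 'C' using C[i] = A[i] + B[i] - C[i-1]."""
    a = df.loc['A'].to_numpy(dtype=float)
    b = df.loc['B'].to_numpy(dtype=float)
    c = df.loc['C'].to_numpy(dtype=float).copy()  # c[0] is the known start value
    for i in range(1, len(c)):
        c[i] = a[i] + b[i] - c[i - 1]
    out = df.copy()   # leave the caller's frame untouched
    out.loc['C'] = c
    return out

df = pd.DataFrame(index=['A', 'B', 'C'],
                  columns=['1', '2', '3', '4'],
                  data=[[100, 60, 40, 60],
                        [200, 10, 50, 80],
                        [50, np.nan, np.nan, np.nan]])

result = fill_row_c(df)
print(result.loc['C'].tolist())  # [50.0, 20.0, 70.0, 70.0]
```

By hand: 60 + 10 - 50 = 20, then 40 + 50 - 20 = 70, then 60 + 80 - 70 = 70.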
azro
  • What happens if instead of columns being `1,2...` I have `2021-11-29, 2021-12-06 ...`? – Jonas Palačionis Jan 11 '22 at 12:40
  • @JonasPalačionis the code doesn't use the column name at all, it shouldn't cause trouble. Also can't you just try and answer that question by yourself ? ;) – azro Jan 11 '22 at 13:46
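A small sketch to check the point made in the comments: the loop only uses positions, so date-like column labels should behave the same. The weekly dates below start at 2021-11-29 as in the comment; the later ones are generated with pd.date_range and are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Weekly column labels; only the first two appear in the comment above,
# the rest are assumed for this example
cols = pd.date_range('2021-11-29', periods=4, freq='7D')

df = pd.DataFrame(index=['A', 'B', 'C'],
                  columns=cols,
                  data=[[100, 60, 40, 60],
                        [200, 10, 50, 80],
                        [50, np.nan, np.nan, np.nan]])

# Purely positional: rows 0, 1, 2 are A, B, C; the column labels are never used
for i in range(1, len(df.columns)):
    df.iloc[2, i] = df.iloc[1, i] + df.iloc[0, i] - df.iloc[2, i - 1]

print(df.iloc[2].tolist())  # [50.0, 20.0, 70.0, 70.0]
```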