0

I need to perform the following steps on a data-frame:

  1. Assign a starting value to the "balance" attribute of the first row.
  2. Calculate the "balance" values for the subsequent rows based on value of the previous row using the formula for eg : (previous row balance + 1)

I have tried the following steps:

Created the data-frame:

df = pd.DataFrame(pd.date_range(start = '2019-01-01', end = '2019-12-31'),columns = ['dt_id'])

Created attribute called 'balance':

df["balance"] = 0

Tried to conditionally update the data-frame:

df["balance"] = np.where(df.index == 0, 100, df["balance"].shift(1) + 1)

Results: enter image description here

From what I can observe, the value is being retrieved for subsequent update before it can be updated in the original data-frame.

The desired output for "balance" attribute :

  • Row 0 : 100

  • Row 1: 101

  • Row 2 : 102

And so on

Abhi
  • 163
  • 2
  • 14

2 Answers2

2

If I understand correctly if you add this line of code after yours, you are ready:

df["balance"].cumsum()

0      100.0
1      101.0
2      102.0
3      103.0
4      104.0
       ...  
360    460.0
361    461.0
362    462.0
363    463.0
364    464.0

It is a cumulative sum, it sums its value with the previous one and since you have the starting value and then ones it will do what you want.

Billy Bonaros
  • 1,671
  • 11
  • 18
  • 1
    Yes that is what I am looking for ,thank you.i will give it a go and mark as solved. – Abhi Sep 12 '19 at 07:41
  • 1
    Just our of curiosity, in a scenario where I need to use the previous row's balance on a formula to calculate the present row's balance, how would I go about doing that ? – Abhi Sep 12 '19 at 07:51
  • You'll do the same because you only need the starting value. Every value is the previews +1 so you only need a starting value. Am I right? – Billy Bonaros Sep 12 '19 at 07:56
1

The problem you have is, that you want to calculate an array and the elements are dependent on each other. So, e.g., element 2 depends on elemen 1 in your array. Element 3 depends on element 2, and so on.

If there is a simple solution, depends on the formula you use, i.e., if you can vectorize it. Here is a good explanation on that topic: Is it possible to vectorize recursive calculation of a NumPy array where each element depends on the previous one?

In your case a simple loop should do it:

balance = np.empty(len(df.index))
balance[0] = 100
for i in range(1, len(df.index)):
  balance[i] = balance[i-1] + 1  # or whatever formula you want to use

Please note, that above is the general solution. Your formula can be vectorized, thus also be generated using:

balance = 100 + np.arange(0, len(df.index))
AnsFourtyTwo
  • 2,480
  • 2
  • 13
  • 33