0

I'm very new to Python, but am trying to do some basic automating of some company processes.

One of the processes is a basic forecast, where the numerical average growth is added to the last month's volume to get a forecast of the next month. This is then iterated over to get a longer term view.

I have already created:

  • My dataframe
  • My latest_row variable which pulls the last row in the data frame
  • My numerical_avg_growth variable which calculates the numerical average growth

This is basically the process I want to create:

Calculate: latest_row['month'] + 1 (to get the next month number) latest_row['volume'] + numerical_avg_growth

Append the above calculations as a row to the main dataframe (and therefore this then becomes the latest_row variable)

And then repeat the process again

Meaning that you're constantly adding to the dataframe and calculating based on the most recent addition.


I can't figure out how to run the loop once, append that result, and then run the loop again based on that newly appended figure.

I'm sure this is fairly straightforward but I can't figure it out. Any help would be really appreciated!!

Data and Code Example:

#Creating dataframe 
data = [[2022, 1, 512116, NaN], [2022, 2, 524775, -1.73], 
[2022,3, 600017, 19.88]]

df = pd.DataFrame(data, columns=['year', 'month', 'volume', 
'volume_growth'])
#Creating average volume growth percentage variable
avg_vol_growth = df['volume_growth'].mean() 

#Creating average volume growth as integer variable
avg_vol_growth_int = np.sum(avg_vol_growth/100) 

#Variable that shows just the last row in the dataframe
latest_row = df.tail(1)

#Creating numerical average growth variable (vol)
#Calculated using (latest actual volume * average growth integer)/12 
numerical_avg_growth = np.sum(((latest_row['volume'])*avg_vol_growth_int)/12)

The result I need is, for example, to add the volume for months 4,5 and 6 to the dataframe. Calculated by adding the numerical_avg_growth to the volume of the previous month.

e.g year month volume 2022 1. 512116 2022. 2. 524775 2022. 3. 600017 2022. 4. (600017 + numerical_avg_growth) 2022. 5. (month 4 volume + numerical_avg_growth) 2022. 6. (month 5 volume + numerical_avg_growth)

Paul
  • 91
  • 1
  • 7
lilyf96
  • 1
  • 1
  • 2
    Welcome to SO. Please provide a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) of your [data](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and code to really understand what you're trying to achieve. Text descriptions are ambiguous. – mozway Jul 21 '23 at 15:09
  • Have just added some more context to the original question @mozway :) – lilyf96 Jul 21 '23 at 15:29
  • Consider first calculating the volume and avg_growth and eventually creating the dataframe (see [here](https://stackoverflow.com/questions/13784192/creating-an-empty-pandas-dataframe-and-then-filling-it/56746204#56746204). ```data = {"year": [2022]*6, "month": [range(1, 7)], "volume": [512116, 524775, 600017], "volume_growth": [0, -1.73, 19.88]} for i in range(4, 7): avg_growth = np.mean(data["volume_growth"][1:]) data["volume"].append(data["volume"][-1] + avg_growth) data["volume_growth"].append(data["volume"][-1] - data["volume"][-2])``` – Paul Jul 21 '23 at 16:06
  • Hi Paul, thanks so much for that advice! I think I can do something along those lines! Really helpful thanks again! – lilyf96 Jul 21 '23 at 16:13
  • Why is there a negative in `volume_growth`? The volume always increases. – Reinderien Jul 22 '23 at 16:22
  • If my answer was helpful and answered your question, it would be nice if you could accept it! :) – Paul Jul 24 '23 at 09:58

1 Answers1

0

Consider calculating volume and avg_growth before creating the dataframe for better efficiency and performance. Check out this helpful post: Link

data = {
"year": [2022]*6, 
"month": [range(1, 7)], 
"volume": [512116, 524775, 600017], 
"volume_growth": [0, -1.73, 19.88]
} 
for i in range(4, 7):     
   avg_growth = np.mean(data["volume_growth"][1:])     
   data["volume"].append(data["volume"][-1] + avg_growth)     
   data["volume_growth"].append(data["volume"][-1] - data["volume"][-2])
Paul
  • 91
  • 1
  • 7