I have 4 columns: Country, Year, GDP Annual Growth and Field Size in MM Barrels.

  • I am looking for a way to write a loop that computes the mean GDP growth over the 5 years following the discovery of a field ("Field Size MM Barrels"). Example: in 1961 a discovery of size 2462 was made in Algeria. What is the average annual GDP growth over the following 5 years (1962-1966)?
  • NaN in the field size column marks years in which no discovery was made. I would like the loop to store each mean in a new column next to Field Size. Any idea how to do that?
Country,Year,GDP Annual Growth,Field_Size_MM_Barrels
Algeria,1961,-13.605441,2462.0
Algeria,1962,-19.685042,2413.0
Algeria,1963,34.313729,NaN
Algeria,1964,5.839413,NaN
Algeria,1965,6.206898,500.0
Yemen,2016,-13.621458,NaN
Yemen,2017,-5.942320,NaN
Yemen,2018,-2.701475,NaN
Divided Neutral Zone: Kuwait/Saudi Arabia,1963,NaN,832.0
Divided Neutral Zone: Kuwait/Saudi Arabia,1967,NaN,1566.0

# read in with
import pandas as pd
df = pd.read_clipboard(sep=',')
Trenton McKinney
Mehdi
  • Hello, instead of inserting an image, can you create a text table that demonstrates what your dataframe looks like, as well as any code you have already tried, or anything else to get us started? – linamnt May 25 '20 at 04:16
  • Also, this is a duplicate question. See [pandas get column average/mean](https://stackoverflow.com/questions/31037298/pandas-get-column-average-mean) – Trenton McKinney May 25 '20 at 04:33

1 Answer

If you could include a sample of the dataframe (say, the first 20 rows), it would be easier to write and test answers. Here's a possible starting point:

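import pandas as pd

# one mean per row; rows without a discovery get NaN
averages = []
# go over all rows of df by position
for row_id in range(len(df)):
    row = df.iloc[row_id]
    # skip years in which no discovery was made
    if pd.isna(row["Field_Size_MM_Barrels"]):
        averages.append(float("nan"))
        continue
    # rows of the same country in the 5 years after the discovery year
    window = df[(df["Country"] == row["Country"])
                & df["Year"].between(row["Year"] + 1, row["Year"] + 5)]
    # mean() skips NaN growth values; an empty window gives NaN
    averages.append(window["GDP Annual Growth"].mean())
# new column next to the field size (the column name is just a suggestion)
df["Mean_GDP_Growth_Next_5Y"] = averages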
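
To try the idea out without the clipboard, here is a minimal self-contained sketch that runs on the sample rows from the question. The function name `add_forward_mean`, its `years` parameter, and the output column name are hypothetical, chosen only for illustration. It assumes the window should be selected by Country and Year rather than by row position, so it never spills into the next country, and that the mean should be taken over whichever years are actually present in the data.

```python
import io

import pandas as pd

# Sample rows copied from the question
CSV = """Country,Year,GDP Annual Growth,Field_Size_MM_Barrels
Algeria,1961,-13.605441,2462.0
Algeria,1962,-19.685042,2413.0
Algeria,1963,34.313729,NaN
Algeria,1964,5.839413,NaN
Algeria,1965,6.206898,500.0
Yemen,2016,-13.621458,NaN
Yemen,2017,-5.942320,NaN
Yemen,2018,-2.701475,NaN
Divided Neutral Zone: Kuwait/Saudi Arabia,1963,NaN,832.0
Divided Neutral Zone: Kuwait/Saudi Arabia,1967,NaN,1566.0
"""


def add_forward_mean(df, years=5):
    """Return a copy of df with the mean GDP growth over the `years`
    calendar years after each discovery (hypothetical helper)."""
    out = df.copy()
    means = []
    for _, row in out.iterrows():
        # No discovery in this year: nothing to compute
        if pd.isna(row["Field_Size_MM_Barrels"]):
            means.append(float("nan"))
            continue
        # Same country, years strictly after the discovery year
        in_window = (out["Country"] == row["Country"]) & out["Year"].between(
            row["Year"] + 1, row["Year"] + years
        )
        # mean() ignores NaN growth values; an empty window yields NaN
        means.append(out.loc[in_window, "GDP Annual Growth"].mean())
    out[f"Mean_GDP_Growth_Next_{years}Y"] = means
    return out


df = pd.read_csv(io.StringIO(CSV))
result = add_forward_mean(df)
print(result[["Country", "Year", "Mean_GDP_Growth_Next_5Y"]])
```

With only these sample rows, the 1961 Algeria discovery is averaged over the four years that are present (1962 to 1965), and discoveries with no later GDP data come out as NaN. On the full dataset each window would cover up to five years.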