I am trying to loop over a dataframe to check whether 3 consecutive rows satisfy the following condition:

df.loc[idx, "GDP"] > df.loc[idx+1, "GDP"] > df.loc[idx+2, "GDP"]

Once it is satisfied, it means we have a recession.

I iterate over it using:

for idx, gdp in df.iterrows():
        if (df.loc[idx, "GDP"]>df.loc[idx+1, "GDP"]>df.loc[idx+2, "GDP"]) and (idx<=length-2):
            print(df.loc[idx, "Quarter"], df.loc[idx, "GDP"], len(df.index)-3)

I added another condition for the case where idx is at its maximum, which is 65 (we have 66 rows), so that the loop only runs until idx=63 and adds 2 to it at the final iteration to compare the last 3 values.

I am getting the correct results, but at the end I get an error saying:

'the label [66] is not in the [index]'

When I split the two conditions into nested ifs, it worked properly:

for idx, gdp in df.iterrows():
        if (idx<=length-2):
            if (df.loc[idx, "GDP"]>df.loc[idx+1, "GDP"]>df.loc[idx+2, "GDP"]):
                print(df.loc[idx, "Quarter"], df.loc[idx, "GDP"], len(df.index))

But I need them to be in the same if condition.

alim1990
  • In the code you are checking the index condition after indexing the dataframe. The index condition should come first, like `if (idx<=length-2) and (df.loc[idx, "GDP"]>df.loc[idx+1, "GDP"]>df.loc[idx+2, "GDP"]):`. But you should avoid loops with dataframes. – Dishin H Goyani Aug 26 '20 at 04:45
  • @DishinHGoyani what do you suggest? I am new to python. – alim1990 Aug 26 '20 at 05:14
  • See how you can use [`shift`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.shift.html#pandas-series-shift). The given answer is good to go. – Dishin H Goyani Aug 26 '20 at 05:19
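
The reordering suggested in the first comment relies on `and` short-circuiting: when the bounds check is false, Python never evaluates the `df.loc[...]` lookups to its right, so no out-of-range label is requested. A minimal sketch, assuming a default integer index (0, 1, 2, ...) and made-up sample values; `last_start` and `declines` are hypothetical names, not from the question:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Quarter": ["2000q1", "2000q2", "2000q3", "2000q4", "2001q1"],
    "GDP": [100.0, 98.0, 97.0, 99.0, 101.0],
})

# Last index that still has two rows after it (assumes a 0..n-1 index).
last_start = len(df) - 3

declines = []
for idx in df.index:
    # Bounds check first: if it is False, `and` stops here and the
    # lookups at idx+1 and idx+2 are never attempted.
    if idx <= last_start and (
        df.loc[idx, "GDP"] > df.loc[idx + 1, "GDP"] > df.loc[idx + 2, "GDP"]
    ):
        declines.append(df.loc[idx, "Quarter"])

print(declines)
```

With the check written the other way round, the lookup at `idx + 1` runs before the bound is tested, which is how a missing-label error can appear on the last row.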

1 Answer

Try avoiding the loop:

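# shift(-1) and shift(-2) line each row up with the next row and the one after it,
# so this mask is True where GDP[i] > GDP[i+1] > GDP[i+2]
recession = (
    df.GDP.gt(df.GDP.shift(-1)) &
    df.GDP.shift(-1).gt(df.GDP.shift(-2))
)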

Pandas and NumPy have optimized C implementations that are more efficient than Python loops.

Read more in the docs and this question
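
To make the `shift` approach concrete, here is a small sketch on made-up data (the values and the names `next_gdp`, `after_next`, `both` and `chain` are assumptions for illustration). It contrasts two ways of combining the shifted comparisons: requiring both later quarters to be below the current one, and requiring a strict chain `GDP[i] > GDP[i+1] > GDP[i+2]` as in the question. The trailing rows compare against `NaN`, which evaluates to `False`, so no bounds check is needed.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Quarter": ["2000q1", "2000q2", "2000q3", "2000q4", "2001q1", "2001q2"],
    "GDP": [100.0, 98.0, 97.0, 99.0, 95.0, 96.0],
})

next_gdp = df["GDP"].shift(-1)    # GDP of the following row
after_next = df["GDP"].shift(-2)  # GDP two rows ahead

# Both later quarters are lower than the current one.
both = df["GDP"].gt(next_gdp) & df["GDP"].gt(after_next)

# Strict chain: each quarter is lower than the one before it.
chain = df["GDP"].gt(next_gdp) & next_gdp.gt(after_next)

print(df.loc[both, "Quarter"].tolist())
print(df.loc[chain, "Quarter"].tolist())
```

The two masks differ on rows such as 2000q4 here (99, 95, 96), where both later values are below the current one but the second is not below the first.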

RichieV