I have a DataFrame, Df2. I'm trying to check each of the last 10 rows for the column Lead_Lag below. If there's any value besides null in any of those rows, then I want a new column Position to equal 'Y':
import glob
import os

import pandas as pd

def run_HG_AUDUSD_15M_Aggregate():
    Df1 = pd.read_csv(max(glob.iglob(r"C:\Users\cost9\OneDrive\Documents\PYTHON\Daily Tasks\Pairs Trading\HG_AUDUSD\CSV\15M\Lead_Lag\*.csv"), key=os.path.getctime))
    Df2 = Df1[['Date', 'Close_HG', 'Close_AUDUSD', 'Lead_Lag']]
    Df2['Position'] = ''
    for index, row in Df2.iterrows():
        if Df2.loc[Df2.index.shift(-10):index, "Lead_Lag"].isnull():
            continue
        else:
            Df2.loc[index, 'Position'] = "Y"
A sample of the data is as follows:
Date              Close_HG  Close_AUDUSD  Lead_Lag
7/19/2017 12:59   2.7       0.7956
7/19/2017 13:59   2.7       0.7955
7/19/2017 14:14   2.7       0.7954
7/20/2017 3:14    2.7       0.791
7/20/2017 5:44    2.7       0.791
7/20/2017 7:44    2.71      0.7925
7/20/2017 7:59    2.7       0.7924
7/20/2017 8:44    2.7       0.7953        Short_Both
7/20/2017 10:44   2.71      0.7964        Short_Both
7/20/2017 11:14   2.71      0.7963        Short_Both
7/20/2017 11:29   2.71      0.7967        Short_Both
7/20/2017 13:14   2.71      0.796         Short_Both
7/20/2017 13:29   2.71      0.7956        Short_Both
7/20/2017 14:29   2.71      0.7957        Short_Both
So in this case I would want the last two values for the new column Position to be 'Y', as there are values in the Lead_Lag column in at least one of the last 10 rows. I want to apply this on a rolling basis: for instance, the row 13 'Position' value would look at rows 12-3, the row 12 'Position' value would look at rows 11-2, etc.
Instead I get the error:
NotImplementedError: Not supported for type RangeIndex
I've tried several variations of the shift method (defining before the loop etc.) and can't get it to work.
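For context on the error: as far as I can tell, Index.shift is intended for datetime-like indexes (shifting by a frequency), so it is not implemented for the default RangeIndex. Separately, calling .isnull() on a slice returns a Series, which would need .all() or .any() before it can be used in an if. Below is a minimal sketch on a made-up toy frame (the data and the window size of 2 are hypothetical), showing the failing call and plain integer arithmetic on the label instead:

```python
import pandas as pd

# Hypothetical toy frame with the default RangeIndex (0, 1, 2, ...)
df = pd.DataFrame({"Lead_Lag": [None, None, "Short_Both", None, None]})

try:
    df.index.shift(-2)  # not implemented for a RangeIndex
    shift_failed = False
except NotImplementedError:
    shift_failed = True

# With integer labels the window start can be computed directly.
# Note that .loc slices include both endpoints.
index = 3
window = df.loc[index - 2:index, "Lead_Lag"]

# Reduce the boolean Series to a single True/False before using it in an `if`
has_value = window.notna().any()
```

Here window covers labels 1 through 3, and has_value is True because label 2 holds a value.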
edit: Here's the solution:
N = 10
Df2['Position'] = ''
for index, row in Df2.iterrows():
    if (Df2.loc[index-N:index, "Lead_Lag"] != "N").any():
        Df2.loc[index, 'Position'] = "Y"
    else:
        Df2.loc[index, 'Position'] = "N"
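
The same rolling rule could probably also be written without iterrows. This is only a sketch on hypothetical data (the frame df below is made up, and it treats "no value" as NaN rather than the "N" string used in the loop above). It follows the rule as described in the question, i.e. it looks at the N rows before the current one and excludes the current row itself:

```python
import numpy as np
import pandas as pd

N = 10  # look-back length, as in the loop above

# Hypothetical sample: 7 empty rows followed by 3 rows with a signal
df = pd.DataFrame({"Lead_Lag": [np.nan] * 7 + ["Short_Both"] * 3})

# 1 where Lead_Lag holds a value, 0 where it is null
has_signal = df["Lead_Lag"].notna().astype(int)

# shift(1) drops the current row from its own window;
# the rolling max is 1 if any of the previous N rows had a value
recent = has_signal.shift(1).rolling(N, min_periods=1).max()

df["Position"] = np.where(recent == 1, "Y", "N")
```

On this sample only the last two rows come out as 'Y'. Dropping the shift(1) would include the current row in its own window, which is closer to what the .loc[index-N:index] slice in the loop does.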