Python Pandas DataFrame delete rows based on values in multiple previous rows

Question

I have a DataFrame with dates as the index and a single column that holds instructions to enter and exit trades. Each row is one of:

'short_entry', 'short_exit', 'long_entry', 'long_exit'.

Rules:

1 - You cannot exit a short (short_exit) position if you don't already hold a short position (short_entry). Likewise for long positions.

2 - You can only enter another short position if the previous short_entry has been closed with a corresponding short_exit. Likewise for long entries and exits.

Based on these rules, the first four rows would be deleted, and the first trade entered would be on 2008-02-28, followed by a short_exit on 2008-03-27. The rest of the df would be updated accordingly.

I have read pretty much everything I can find in the pandas docs and online. There are answers that delete rows based on the value in the single row above (using .shift()), or that use conditions inside .loc[]. But I just cannot get my head around how to put these together to delete a row based on the values of multiple previous rows. I can do it easily using for loops and df.itertuples().

Is there a Pythonic, pandas-native way of doing this? Any help or hints would be greatly appreciated.

Thanks

Please replace your image with text; it will be more readable. Also read [good-reproducible-pandas](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) - Brown Bear, Sep 24 '17 at 12:11

Evgeny · Answer 1 · 2017-09-24T12:54:18.593

0

Your rules imply a state machine: you walk through the rows while tracking which positions are open, and the result is a set of row indexes marked for deletion.

I do not think there is a single function in pandas that takes rules like these as arguments.
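As a rough sketch of the state-machine idea (not a definitive implementation), something like the following could work. The column name `signal`, the function names `valid_trade_mask` and `valid_trade_mask_vectorized`, and the sample data are all made up for illustration. It also assumes that long and short positions are tracked independently of each other, which the question suggests but does not state outright.

```python
import pandas as pd


def valid_trade_mask(signals: pd.Series) -> pd.Series:
    """Explicit state machine: True for rows that the two rules allow."""
    holding = {"short": False, "long": False}  # one open/closed flag per side
    keep = []
    for signal in signals:
        side, action = signal.split("_")  # "short_entry" -> ("short", "entry")
        if action == "entry":
            ok = not holding[side]  # rule 2: no second entry while one is open
        else:
            ok = holding[side]      # rule 1: no exit without an open position
        if ok:
            holding[side] = action == "entry"
        keep.append(ok)
    return pd.Series(keep, index=signals.index)


def valid_trade_mask_vectorized(signals: pd.Series) -> pd.Series:
    """Same mask without a Python loop (assumes independent sides)."""
    parts = signals.str.split("_", expand=True)
    side, action = parts[0], parts[1]
    # Previous action on the same side; a side with no history starts as "exit".
    previous = action.groupby(side).shift().fillna("exit")
    return action != previous


# Hypothetical sample data, not the data from the question.
df = pd.DataFrame(
    {
        "signal": [
            "short_exit", "long_exit", "short_entry", "short_entry",
            "long_entry", "short_exit", "long_exit", "long_exit",
        ]
    },
    index=pd.date_range("2020-01-01", periods=8, freq="D"),
)

filtered = df[valid_trade_mask(df["signal"])]
print(filtered)
```

The vectorized variant relies on one observation: after any entry the side is open and after any exit it is closed, whether or not that row was kept. So a row is valid exactly when its action differs from the previous action on the same side, which `groupby(...).shift()` can express directly. If the two sides should interact (for example, no long entry while a short is open), that shortcut no longer applies and the explicit loop is the safer starting point.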

edited Sep 24 '17 at 12:54

answered Sep 24 '17 at 12:33
