1

I have a large dataframe of stockprice data with df.columns = ['open','high','low','close']

Problem definition: When an EMA crossover happens, i am mentioning df['cross'] = cross. Everytime a crossover happens, if we label the current crossover as crossover4, I want to check if the minimum value of df['low'] between crossover 3 and 4 IS GREATER THAN the minimum value of df['low'] between crossover 1 and 2. I have made an attempt at the code based on the help i have received from 'Gherka' so far. I have indexed the crossing over and found minimum values between consecutive crossovers. So, everytime a crossover happens, it has to be compared with the previous 3 crossovers and I need to check MIN(CROSS4,CROSS 3) > MIN(CROSS2,CROSS1).

I would really appreciate it if you guys could help me complete.

import pandas as pd    
import numpy as np    
import bisect as bs

data = pd.read_csv("Nifty.csv")    
df = pd.DataFrame(data)    

df['5EMA'] = df['Close'].ewm(span=5).mean()    
df['10EMA'] = df['Close'].ewm(span=10).mean()    
condition1 = df['5EMA'].shift(1) < df['10EMA'].shift(1)    
condition2 = df['5EMA'] > df['10EMA']    
df['cross'] = np.where(condition1 & condition2, 'cross', None)    
cross_index_array = df.loc[df['cross'] == 'cross'].index

def find_index(a, x):    
    i = bs.bisect_left(a, x)    
    return a[i-1]

def min_value(x):
    """Find the minimum value of 'Low' between crossovers 1 and 2, crossovers 3 and 4, etc..."""    
    cur_index = x.name    
    prev_cross_index = find_index(cross_index_array, cur_index)    
    return df.loc[prev_cross_index:cur_index, 'Low'].min()

df['min'] = None    
df['min'][df['cross'] == 'cross'] = df.apply(min_value, axis=1)    
print(df)
smci
  • 32,567
  • 20
  • 113
  • 146
Kartick
  • 65
  • 6
  • It helps if you say you're trying to find where the 5-day Exponential Moving Average crosses the 10-day EMA. Can't you use pandas builtin [`ewm`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.ewm.html)? – smci Jan 18 '19 at 21:01
  • Related: [Simple Python Pandas EMA (ewma)?](https://stackoverflow.com/questions/48613151/simple-python-pandas-ema-ewma) – smci Jan 18 '19 at 21:05
  • If the current crossover is considered as crossover 4, I want to find the minimum value of low between crossover 3 and 4 and check if it is greater than the minimum value of low between crossover 1 and 2 continually, everytime a crossover happens and considering the current crossover to be crossover4. – Kartick Jan 18 '19 at 21:11
  • Kartick: Welcome to SO. Please edit the statement of the question, into the question itself, not comments underneath it. Is that the original question or a second followup question? (I can't see how the existing code corresponds to that) – smci Jan 18 '19 at 21:14
  • Gherka below has already helped me a lot with the problem. It's just the final nailing thats remaining. my problem statement kinda changed along the way. I will edit the question now. – Kartick Jan 18 '19 at 21:17
  • Also, since your function `min_value(x)` is non-trivial, recommended Python practice is to add a docstring e.g. `"""Find the minimum value of 'Low' between crossovers 1 and 2, crossovers 3 and 4, etc..."""` – smci Jan 18 '19 at 21:17

2 Answers2

1

This should do the trick:

import pandas as pd

df = pd.DataFrame({'open': [1, 2, 3, 4, 5],
                   'high': [5, 6, 6, 5, 7],
                   'low': [1, 3, 3, 4, 4],
                   'close': [3, 5, 3, 5, 6]})

df['day'] = df.apply(lambda x: 'bull' if (
    x['close'] > x['open']) else None, axis=1)

df['min'] = None
df['min'][df['day'] == 'bull'] = pd.rolling_min(
    df['low'][df['day'] == 'bull'], window=2)

print(df)

#    close  high  low  open   day   min
# 0      3     5    1     1  bull   NaN 
# 1      5     6    3     2  bull     1
# 2      3     6    3     3  None  None
# 3      5     5    4     4  bull     3
# 4      6     7    4     5  bull     4

Open for comments!

mommermi
  • 982
  • 10
  • 18
  • Thanks a lot. I was able to build on this and I have posted another question. It would be great if you could have a look.https://stackoverflow.com/questions/54241264/finding-minimum-value-between-2-conditions-in-a-data-frame – Kartick Jan 17 '19 at 17:27
  • Do you mean an element-wise comparison of the columns, or a comparison across all rows of the columns? – mommermi Jan 17 '19 at 17:56
  • element wise. Also pd.rolling_min doesn't work for me. Is there an alternative? It says pandas doesn't have attribute rolling_min() – Kartick Jan 17 '19 at 18:17
  • if `rolling_min` does not exist, you should probably upgrade your version of pandas – mommermi Jan 17 '19 at 18:20
0

If I understand your question correctly, you need a dynamic "rolling window" over which to calculate the minimum value. Assuming your index is a default one meaning it's sorted in the ascending order, you can try the following approach:

import pandas as pd
import numpy as np
from bisect import bisect_left

df = pd.DataFrame({'open': [1, 2, 3, 4, 5],
                   'high': [5, 6, 6, 5, 7],
                   'low': [1, 3, 2, 4, 4],
                   'close': [3, 5, 3, 5, 6]})

This uses the same sample data as mommermi, but with low on the third day changed to 2 as the third day should also be included in the "rolling window".

df['day'] = np.where(df['close'] > df['open'], 'bull', None)

We calculate the day column using vectorized numpy operation which should be a little faster.

bull_index_array = df.loc[df['day'] == 'bull'].index

We store the index values of the rows (days) that we've flagged as bulls.

def find_index(a, x):
    i = bisect_left(a, x)
    return a[i-1]

Bisect from the core library will enable us to find the index of the previous bull day in an efficient way. This requires that the index is sorted which it is by default.

def min_value(x):
    cur_index = x.name
    prev_bull_index = find_index(bull_index_array, cur_index)
    return df.loc[prev_bull_index:cur_index, 'low'].min()

Next, we define a function that will create our "dynamic" rolling window by slicing the original dataframe by previous and current index.

df['min'] = df.apply(min_value, axis=1)

Finally, we apply the min_value function row-wise to the dataframe, yielding this:

    open    high    low     close   day     min
0   1       5       1       3       bull    NaN
1   2       6       3       5       bull    1.0
2   3       6       2       3       None    2.0
3   4       5       4       5       bull    2.0
4   5       7       4       6       bull    4.0
gherka
  • 1,416
  • 10
  • 17
  • Hey, your solution was perfect. It's exactly what i wanted. The last line gives an error. Can you please help me with that? I have added my code. Can you please take a look? – Kartick Jan 18 '19 at 16:07
  • Hard to tell exactly without sample data, but the way you do the indexing is one thing to throw up an error: `df.loc[prev_cross_index, cur_index, df['Low']].min()`. Try `df.loc[prev_bull_index:cur_index, 'low'].min()` instead. The first part inside the square brackets (`prev_bull_index:cur_index`) specifies the rows using slice notation and the second part (`'low'`) is the label (or name) of the columns you want to return. – gherka Jan 18 '19 at 18:02
  • I have attached sample data. I have also made the change with respect to 'low'. – Kartick Jan 18 '19 at 18:35
  • I am not sure you will be able to copy the sample data from here though. Basically any stock price data with open, high, low and close will do. – Kartick Jan 18 '19 at 18:47
  • Hm...I got some test data from the FT and the updated script ran without any errors. Can you please post the error you're getting now as the one in the original post is still referencing the wrong indexing? – gherka Jan 18 '19 at 19:05
  • Hey, it works fine now. I just need 1 final thing. I want to create a new column, df[trade'] which says 'buy' when current min > previous min. Can you help me with this final bit? – Kartick Jan 18 '19 at 20:13
  • Honestly, I can't thank you enough. This has been terrific. thanks a lot. – Kartick Jan 18 '19 at 20:23
  • Glad it worked for you. Can you accept the answer, please? For the 'buy' column just filter the DF to where 'min' isn't NA, create 'buy' column with your condition in a [loop](https://stackoverflow.com/questions/34855859/is-there-a-way-in-pandas-to-use-previous-row-value-in-dataframe-apply-when-previ) and merge on index. – gherka Jan 19 '19 at 12:04