1
    open      high    low   close   volume  date      time  
0   0.9738    0.9742    0.9738  0.9740  48  2009-09-27  1900-01-01 18:00:00  
1   0.9738    0.9739    0.9737  0.9737  11  2009-09-27  1900-01-01 18:01:00  
2   0.9733    0.9734    0.9733  0.9734  6   2009-09-27  1900-01-01 18:02:00  
3   0.9734    0.9734    0.9734  0.9734  1   2009-09-27  1900-01-01 18:03:00  
4   0.9735    0.9735    0.9735  0.9735  1   2009-09-27  1900-01-01 18:04:00  

I have a large dataframe that looks like above (1-minute intraday bars of a stock price).

question: how do i get the next bar of prices if there are no prices for the 1-minute bar at 8:00:00 am? I want to be able to get the open price for each day where the time = 8:00:00 am, or the next closest price after.

The function below gets the open, cumulative high, cumulative low, and the close reconstructed from some continuous section of bars (continually from a set open time to a set close time).

def getOpenHighLowClose(x=None, openTime=None, closeTime=None):
    x.loc[(x['time']==openTime), 'openPriceOfDay'] = x['open']  
    x.loc[(x['time']==closeTime), 'closePriceOfDay'] = x['close']  

    x['openPriceOfDay']=x['openPriceOfDay'].fillna(0)  
    x['closePriceOfDay']=x['closePriceOfDay'].fillna(0)  

    x['OpenCashMkt']=x['openPriceOfDay'].max()  
    x['CloseCashMkt']=x['closePriceOfDay'].max()  

    x.loc[(x['time']>=openTime) & (x['time']<=closeTime), 'cumHigh'] =   x['high'].cummax()
    x.loc[(x['time']>=openTime) & (x['time']<=closeTime), 'cumLow'] = x['low'].cummin()

I wrote the code this way so that I could construct my own [open high low close] for any timeframe and use .shift(x) to create a return series using groupby 'date'.

I am a novice so please tell me if I can clarify any further.

Thanks!

  • Back in the "old days" before electronic trading, the "open" was an average of all the trades in the first minute (or few minutes, depending on the exchange). In other words, this is not a new problem. – Thane Plummer Oct 19 '15 at 21:27
  • great, do you have any suggestions of what to search then? I also want to return the resulting time associated with the next first available bar – David Villa Oct 19 '15 at 21:41

1 Answers1

0

You can group by date and take the first open price (assuming the data has already been sorted by time).

df.groupby('date')['open'].first()

You can also set the index to the timestamp:

df.set_index(pd.to_datetime(df['date'] + ' ' + df['time']), inplace=True)

This will allow you to easily access data:

def ohlc(start, end):
    ticks = df.loc[start:end]
    open = ticks['open'].dropna()[0]
    high = ticks['high'].max()
    low = ticks['low'].min()
    close = ticks['close'].dropna()[-1]
    vol = ticks['volume'].sum()
    return pd.Series({'start': start, 'end': end, 'open': open, 'high': high, 'low': low, 'close': close, 'volume': vol})

Also, see Converting OHLC stock data into a different timeframe with python and pandas

Community
  • 1
  • 1
Alexander
  • 105,104
  • 32
  • 201
  • 196