I'd like to group a dataset by an ID column that I set up, so I have:

df_grouped = df_grouped.groupby(by='groupID').apply(create_ohlc)

and create_ohlc is the following:

def create_ohlc(data):
    data['open'] = data.loc[0, 'price']
    data['high'] = data.loc[:, 'price'].max()
    data['low'] = data.loc[:, 'price'].min()
    data['close'] = data.loc[-1, 'price']
    return data

I could fix it like this:

def create_ohlc(data):
    data['open'] = data.loc[data.index[0], 'price']
    data['high'] = data.loc[:, 'price'].max()
    data['low'] = data.loc[:, 'price'].min()
    data['close'] = data.loc[data.index[-1], 'price']
    return data

But I still don't understand what is going on, and it takes quite a while to run. Is there something wrong?

Alexandre Tavares

2 Answers

If you want to combine integer-position indexing with label indexing, it's better to use integer indexing throughout and look up the integer position of your labels (see the pandas indexing docs):

data['open'] = data.iloc[0, data.columns.get_loc('price')]
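
Applied to the whole function, that idea might look like the sketch below. The sample data is made up for illustration; only the 'groupID' and 'price' column names come from the question. The function is called once per group with a plain loop and pd.concat, which should behave much like groupby(...).apply(create_ohlc) but avoids differences between pandas versions in how apply treats the grouping column.

```python
import pandas as pd

def create_ohlc(data):
    # Work on a copy so the group handed in by groupby is not mutated.
    data = data.copy()
    price_col = data.columns.get_loc('price')   # integer position of 'price'
    data['open'] = data.iloc[0, price_col]      # first row of the group, by position
    data['high'] = data['price'].max()
    data['low'] = data['price'].min()
    data['close'] = data.iloc[-1, price_col]    # last row of the group, by position
    return data

# Hypothetical sample data, not from the question.
df = pd.DataFrame({
    'groupID': [1, 1, 1, 2, 2],
    'price': [10.0, 12.0, 11.0, 20.0, 18.0],
})

# Call create_ohlc once per group and stitch the pieces back together.
ohlc = pd.concat(create_ohlc(group) for _, group in df.groupby('groupID'))
```

Every row of a group ends up carrying that group's open, high, low and close values, and the original index labels are preserved.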
Sina Meftah

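This is because loc selects rows by label, whereas iloc selects rows by integer position. Inside each group the rows keep the index labels of the original DataFrame, so with a default integer index the label 0 is present in at most one group, and -1 is not a label at all.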
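
A small sketch of the difference, using a made-up frame with the default integer index (only the 'groupID' and 'price' column names come from the question):

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..4.
df = pd.DataFrame({
    'groupID': [1, 1, 1, 2, 2],
    'price': [10.0, 12.0, 11.0, 20.0, 18.0],
})

# The second group keeps its original index labels, 3 and 4.
group = df[df['groupID'] == 2]
group_labels = list(group.index)

# iloc counts positions within the group, so 0 and -1 always work.
first_by_position = group.iloc[0]['price']
last_by_position = group.iloc[-1]['price']

# loc looks for a row labelled 0, which this group does not contain.
try:
    group.loc[0, 'price']
    label_zero_found = True
except KeyError:
    label_zero_found = False
```

This is also why data.loc[data.index[0], 'price'] works: data.index[0] turns the first position into whatever label that row actually has.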

See also How are iloc and loc different?
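
On the speed concern in the question: apply runs a Python function once per group, which can be slow when there are many groups. One possible alternative, sketched here with made-up data and not taken from either answer, is to let groupby(...).transform broadcast each group's value back to its rows. Note that 'first' and 'last' skip missing values, so this matches the positional version only when 'price' has no NaN entries.

```python
import pandas as pd

# Hypothetical sample data, not from the question.
df = pd.DataFrame({
    'groupID': [1, 1, 1, 2, 2],
    'price': [10.0, 12.0, 11.0, 20.0, 18.0],
})

prices = df.groupby('groupID')['price']

# Each transform returns a Series aligned with df's index.
out = df.assign(
    open=prices.transform('first'),   # first non-null price per group
    high=prices.transform('max'),
    low=prices.transform('min'),
    close=prices.transform('last'),   # last non-null price per group
)
```

Like the original function, this assumes the rows are already in the intended order within each group.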

Alexander Volkovsky