I have a dataframe like below, with 2 different SecuCode
. I'd need to group them by column SecuCode
, create a new column from Volume
and AShare
, then do a rolling mean.
SecuCode TradingDay Volume AShare
0 600455.SH 2013-01-04 1484606 49717768
1 600455.SH 2013-01-07 1315166 49717768
2 600455.SH 2013-01-08 1675933 49717768
3 600455.SH 2013-01-09 1244098 49717768
4 600455.SH 2013-01-10 751279 49717768
5 600551.SH 2018-12-24 1166098 505825296
6 600551.SH 2018-12-25 3285799 505825296
7 600551.SH 2018-12-26 3534143 505825296
8 600551.SH 2018-12-27 2462501 505825296
9 600551.SH 2018-12-28 2282954 505825296
for a single dataframe, the code will be like:
df['volumn_percentage'] = df['Volume']/df['AShare']
df['turnover'] = df['volumn_percentage'].rolling(2).mean()
I need to do these multi-step operations in each frame from
df_grouped = df.groupby(by='SecuCode')
state, frame = next(iter(df_grouped))
I wonder how can I do the same thing in each group of this dataframe, and then restore to the original dataframe format? there will be ~1000 unique SecuCode
, and there will be ~1000 trading days, so looping through all frame/groups seems to be very slow. I may need to do more operations other than the above 2 (1 column divided by another, rolling mean), so a more general ways is much appreciated.