I have a large dataset and I am computing the daily standard deviation of the residual for each ID. The code is correct; however, when I run it, it just keeps running and running.
This is my data
This is my code:
The first two lines compute the mean and the count of residual for each ID and repeat that value on every row belonging to that ID in my dataframe, so that the last three lines can easily compute the variance and the standard deviation.
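# Per-ID mean and count, repeated on every row of that ID
C['mean'] = C.apply(lambda x: C[(C.ID == x.ID)].residual.mean(), axis=1)
C['size'] = C.apply(lambda x: C[(C.ID == x.ID)].residual.count(), axis=1)
# Squared deviation from the per-ID mean, then variance and standard deviation
C['diff2'] = (C['residual'] - C['mean']) ** 2
C['var'] = C['diff2'] / (C['size'] - 1)
C['stddev'] = C['var'] ** 0.5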
My question is: how can I make this code more efficient?
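For context, each `apply(..., axis=1)` call above filters the whole dataframe once per row, so the work grows roughly with the square of the number of rows. Below is a minimal sketch of one possible vectorized alternative using `groupby(...).transform(...)`. The toy data is hypothetical (the real `C` is not shown here), and the `std_per_id` column is an assumption about what may ultimately be wanted, not part of the original code.

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataframe C.
C = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2],
    "residual": [1.0, 2.0, 3.0, 4.0, 6.0],
})

# Group once, then broadcast each group's statistic back onto its rows.
grouped = C.groupby("ID")["residual"]
C["mean"] = grouped.transform("mean")
C["size"] = grouped.transform("count")

# Same column arithmetic as in the code above.
C["diff2"] = (C["residual"] - C["mean"]) ** 2
C["var"] = C["diff2"] / (C["size"] - 1)
C["stddev"] = C["var"] ** 0.5

# Assumption: if a single sample standard deviation per ID is the goal,
# the built-in "std" aggregation (ddof=1 by default) gives it directly.
C["std_per_id"] = grouped.transform("std")

print(C)
```

The `mean` and `size` columns here should match what the two `apply` lines produce, while the grouping is done only once instead of once per row.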