I'm trying to normalize columns of a pandas DataFrame within groups defined by the date.
My dataset looks like this:
date | permno | ret | cumret | mom1m | mom3m | mom6m |
---|---|---|---|---|---|---|
2004-01-30 | 80000 | 0.053 | 1.497 | 0.067 | 0.140 | 0.137 |
2004-02-29 | 80000 | 0.053 | 1.497 | 0.067 | 0.140 | 0.137 |
2004-03-31 | 80000 | 0.053 | 1.497 | 0.067 | 0.140 | 0.137 |
2004-01-30 | 80001 | 0.053 | 1.497 | 0.067 | 0.140 | 0.137 |
2004-02-29 | 80001 | 0.053 | 1.497 | 0.067 | 0.140 | 0.137 |
2004-03-31 | 80001 | 0.053 | 1.497 | 0.067 | 0.140 | 0.137 |
I'm trying to scale `mom1m`, `mom3m`, and `mom6m` based on the dates.
That is, rows sharing the same date should be scaled together: the first row with the 4th row, the second row with the 5th row, and the third row with the last row.
What I've tried is:
crsp2[scale_cols] = crsp2.groupby('date')[scale_cols].apply(lambda x: StandardScaler().fit_transform(x))
where `crsp2` is the DataFrame I'm trying to scale and `scale_cols` is the list of features I'm trying to scale.
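To make the goal concrete, here is a minimal sketch of the per-date standardization I have in mind, written with `groupby(...).transform` rather than `StandardScaler`. The toy values and the `zscore` helper are made up for illustration and are not my real data; `ddof=0` is an assumption chosen so the result matches the population standard deviation that `StandardScaler` uses.

```python
import pandas as pd

# Toy frame with the same layout as the table above.
# The numbers are invented for illustration only.
crsp2 = pd.DataFrame({
    "date": pd.to_datetime(["2004-01-30", "2004-02-29", "2004-03-31"] * 2),
    "permno": [80000] * 3 + [80001] * 3,
    "mom1m": [0.01, 0.02, 0.03, 0.03, 0.06, 0.09],
    "mom3m": [0.10, 0.20, 0.30, 0.30, 0.40, 0.50],
    "mom6m": [0.50, 0.60, 0.70, 0.80, 0.70, 0.60],
})
scale_cols = ["mom1m", "mom3m", "mom6m"]


def zscore(col):
    # Hypothetical helper: subtract the group mean and divide by the
    # group standard deviation (ddof=0, i.e. population std).
    return (col - col.mean()) / col.std(ddof=0)


# transform returns a result aligned to the original index, so it can be
# assigned back to the same columns without reordering rows.
scaled = crsp2.copy()
scaled[scale_cols] = crsp2.groupby("date")[scale_cols].transform(zscore)
print(scaled)
```

After this, each date group should have mean 0 and unit standard deviation in every scaled column. Note that a group in which a column is constant would produce NaN here, because its standard deviation is zero.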