Domain: Python & Pandas
I have a time-series DataFrame that holds the total number of customers per day for the last 10 years.
The columns are:
- date
- total customers
There are outliers in my total customers column.
I want to reset every outlier that lies more than 3 standard deviations above the mean to the cap value given by the formula below.
Replacement value for an outlier above 3 SD = mean + 3 × SD
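
To make the intent concrete, here is a minimal sketch of the capping step. The sample data is made up for illustration, and I am assuming the columns are literally named `date` and `total customers`. It uses the sample standard deviation, which is what pandas computes by default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 60 ordinary days plus one extreme spike.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "date": pd.date_range("2020-01-01", periods=61, freq="D"),
    "total customers": np.append(rng.normal(100, 5, size=60), 1000.0),
})

col = "total customers"  # assumed column name
mean = df[col].mean()
sd = df[col].std()       # sample standard deviation (ddof=1) by default
upper = mean + 3 * sd    # cap value from the formula: mean + 3 * SD

# Cap values above the threshold; values at or below it are left unchanged.
df[col] = df[col].clip(upper=upper)
```

Note that the mean and SD here are computed on the data with the outliers still in it, so the cap itself is pulled upward by the values it is meant to limit.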