I am trying to clip outliers in the DataFrame based on quantiles for each column. Let's say
df = pd.DataFrame(pd.np.random.randn(10,2))
0 1
0 0.734355 0.594992
1 -0.745949 0.597601
2 0.295606 0.972196
3 0.474539 1.462364
4 0.238838 0.684790
5 -0.659094 0.451718
6 0.675360 -1.286660
7 0.713914 0.135179
8 -0.435309 -0.344975
9 1.200617 -0.392945
I currently use
df_clipped = df.apply(lambda col: col.clip(*col.quantile([0.05,0.95]).values))
0 1
0 0.734355 0.594992
1 -0.706865 0.597601
2 0.295606 0.972196
3 0.474539 1.241788
4 0.238838 0.684790
5 -0.659094 0.451718
6 0.675360 -0.884488
7 0.713914 0.135179
8 -0.435309 -0.344975
9 0.990799 -0.392945
This works but I am wondering if there is a more elegant pandas/numpy based approach.