I have a dataframe (e.g. df_EDA) with 64 columns. How can I plot multiple boxplots, quickly identify the outliers, and remove them?

As of now, when I run the boxplot: sns.boxplot(data=df_EDA)

It appears as follows: [image]

ricenix
  • You'll need a thorough definition of outliers that works well for your data and the intended use case. A one-size-fits-all solution doesn't exist. Also, you probably don't want to put columns whose values have very different ranges together in one subplot. – JohanC Jan 28 '23 at 14:01
  • It is likely that the outlier is always the first box. If that is the case, just remove it. – Nejc Jan 28 '23 at 16:19
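
Following up on the comments, here is a minimal sketch of one possible approach, not a definitive answer. It assumes "outlier" means a value outside Tukey's fences (Q1 - 1.5*IQR, Q3 + 1.5*IQR), which is the same rule seaborn's boxplot whiskers use by default; that definition may not suit your data. The function names `iqr_outlier_mask`, `drop_outlier_rows` and `boxplot_grid`, and the parameters `k` and `ncols`, are made up for illustration.

```python
import math

import pandas as pd


def iqr_outlier_mask(df, k=1.5):
    """Return a boolean frame that is True where a value falls outside
    [Q1 - k*IQR, Q3 + k*IQR] of its own column (numeric columns only)."""
    num = df.select_dtypes(include="number")
    q1 = num.quantile(0.25)
    q3 = num.quantile(0.75)
    iqr = q3 - q1
    return (num < q1 - k * iqr) | (num > q3 + k * iqr)


def drop_outlier_rows(df, k=1.5):
    """Drop every row that has an outlier in at least one numeric column.
    Assumption: removing the whole row is acceptable for your use case."""
    mask = iqr_outlier_mask(df, k)
    return df[~mask.any(axis=1)]


def boxplot_grid(df, ncols=8):
    """Draw one boxplot per numeric column, each with its own y-axis,
    so columns with very different ranges do not squash each other."""
    # Imported here so the filtering helpers work without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    num = df.select_dtypes(include="number")
    nrows = math.ceil(num.shape[1] / ncols)
    fig, axes = plt.subplots(nrows, ncols,
                             figsize=(2.5 * ncols, 3 * nrows), squeeze=False)
    flat_axes = axes.ravel()
    for ax, col in zip(flat_axes, num.columns):
        sns.boxplot(y=num[col], ax=ax)
        ax.set_title(col)
    # Hide any unused panels in the last row of the grid.
    for ax in flat_axes[num.shape[1]:]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig
```

With the dataframe from the question this would be used as `boxplot_grid(df_EDA)` to inspect the columns, `iqr_outlier_mask(df_EDA).sum()` to count flagged values per column, and `df_clean = drop_outlier_rows(df_EDA)` to filter. Note that with 64 columns, dropping a row whenever any single column is flagged can remove a large share of the data, so it is worth checking the per-column counts first.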
