
I tried to plot a histogram with seaborn's histplot, but I found that a single outlier in the data breaks the plot (it causes the Jupyter notebook to crash).

my data:

RESULT         Category
-999           A
0.000711       A
0.0006         A
...            ...
0.00043        B
0.00041        B
...            ...
0.00074        C

I tried excluding -999, and the plot rendered:

aaa = test_df_1[test_df_1['RESULT'] != -999]

sns.histplot(data=aaa, x='RESULT', hue='Category', edgecolor="none")

↑ This works fine

But I don't want to remove the outlier:

sns.histplot(data=test_df_1, x='RESULT', hue='Category', edgecolor="none")

↑ This crashes the program

  • 1
    Please share a minimal reproducible example as your question is impossible to answer – The Singularity Sep 08 '21 at 10:25
  • 1
    does this answer your question? the idea is to clip the values and simply note them down (e.g. in legend) https://stackoverflow.com/a/51050772/3896008 – lifezbeautiful Sep 08 '21 at 11:18
  • 3
    Set `binwidth=.01` or `bins=100` or something like that to specify a bin width or count that works for your data. But I don't think you're going to be able to see more than "one bar for the outlier, one bar for the other values". Are you sure that's a real outlier? Seeing -999 with otherwise very small positive numbers makes me think that's how "missing value" is being coded, or something similar. – mwaskom Sep 08 '21 at 12:44
  • Thanks for all the replies :), @mwaskom. After I added the bins parameter, it was able to execute. Yes, -999 is an outlier, and as you said, I can only see "one bar for the outlier, one bar for the other values", which is what I want. – user16859572 Sep 10 '21 at 02:21
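
A rough sketch of why an explicit `bins` value helps, under some assumptions: the data below is synthetic (the real `test_df_1` is not shown), `fd_bin_count` and `plot_fixed_bins` are hypothetical helper names, and the explanation assumes the default `bins="auto"` is forwarded to `numpy.histogram_bin_edges`. NumPy's "auto" rule takes the narrower of the Sturges and Freedman-Diaconis bin widths, and the latter depends on the interquartile range. With values around 0.0005 and one -999, that width stays tiny while the range is about 999, so the requested bin count can run into the millions, which would plausibly explain the crash.

```python
import numpy as np


def fd_bin_count(values):
    """Estimate how many bins the Freedman-Diaconis rule would request."""
    values = np.asarray(values, dtype=float)
    q75, q25 = np.percentile(values, [75, 25])
    # Freedman-Diaconis bin width: 2 * IQR / n^(1/3)
    width = 2.0 * (q75 - q25) / np.cbrt(values.size)
    if width == 0:
        return 1
    return int(np.ceil((values.max() - values.min()) / width))


# Synthetic stand-in for the RESULT column (assumption, not the asker's data):
# 300 small positive values plus a single -999.
rng = np.random.default_rng(0)
result = np.append(rng.uniform(0.0004, 0.0008, size=300), -999.0)
category = np.array(["A", "B", "C"])[np.arange(result.size) % 3]

print("bins without the outlier:", fd_bin_count(result[result != -999]))
print("bins with the outlier:   ", fd_bin_count(result))


def plot_fixed_bins(values, labels, n_bins=100):
    """Draw the histogram with an explicit bin count, keeping the outlier."""
    # Imported lazily so the estimate above runs without a plotting stack.
    import pandas as pd
    import seaborn as sns

    df = pd.DataFrame({"RESULT": values, "Category": labels})
    # An explicit integer bin count bypasses the automatic width rule.
    return sns.histplot(data=df, x="RESULT", hue="Category",
                        bins=n_bins, edgecolor="none")
```

Calling `plot_fixed_bins(result, category)` in a notebook should give the "one bar for the outlier, one bar for the other values" picture described in the comments. If -999 is really a missing-value code, replacing it with NaN before plotting may be the better choice.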

0 Answers