
I have created a boxplot in R Commander, and it shows a few outliers that are not in the dataset. The highest number in the dataset is 20.5, yet the plot says it has outliers as high as 572.

Where are these outliers coming from?

I know I can just hide the outliers, but I am worried that if it is using the data wrongly, then hiding the problem isn't solving anything.

Image of box plot and output

SecretAgentMan

1 Answer


The number 572 is the index of the point (your dataset has ~594 rows), not the value itself.

Look at the vertical axis. Your highest-valued outlier doesn't exceed 20.5, which matches what you said about the maximum value in your data.

OP's graph with a red arrow from the outlier to the vertical axis, showing that the point sits close to the "20" tick mark.
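To check this yourself, here is a minimal sketch in base R with made-up data (not your dataset): 594 values between 0 and 10, with row 572 set to 20.5. The vector `x` and the variable names are assumptions for illustration only. It separates the outlier's value from its row number, which is the distinction the plot label is hiding.

```r
# Hypothetical data: 594 values spread evenly over 0..10,
# with row 572 overwritten by the maximum value 20.5.
x <- rep(seq(0, 10, length.out = 99), 6)
x[572] <- 20.5

# boxplot.stats() applies the same 1.5 * IQR rule that boxplot() uses.
stats <- boxplot.stats(x)
outlier_values <- stats$out                 # the data values flagged as outliers
outlier_rows   <- which(x %in% stats$out)   # the row numbers of those values

max(x)            # 20.5
outlier_values    # 20.5
outlier_rows      # 572

# Draw the plot and label the outlier with its row number,
# mimicking the kind of label seen in the question.
boxplot(x)
text(1, outlier_values, labels = outlier_rows, pos = 4)
```

The label reads 572, but the point itself sits at 20.5 on the vertical axis. As far as I know, R Commander draws its boxplots with `Boxplot()` from the `car` package, which labels outliers by row name or row number; treat that detail as an assumption and check the command R Commander prints in its script window.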

SecretAgentMan