I am working on a ggplot (attached here). I want to replace the black outlier dots with the number of participants at each level. I understand that outlier.shape = NA will remove the outliers, but how can I add the number of participants at those levels? I am keeping the code very simple at this point, as I will add labels and titles once this query is resolved.

For example: a) on the upper boxplot, instead of the 4 outlier dots, I want to add the numbers "55, 67, 89, 90"; b) on the lower boxplot, instead of the 4 outlier dots, I want to add the numbers "34, 56, 34, 23".

My code is given below:

ggplot(dist, aes(x=treatment, y=outcome)) + geom_boxplot()+ylim(0,24)+ theme_void()+ coord_flip()

GGPLOT

  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick May 04 '20 at 03:27
  • You'll likely need to determine the outliers externally to `ggplot` and then add something with `geom_text`. – r2evans May 04 '20 at 03:29
  • Can someone please add the code? I have given examples of the numbers I need in place of the outlier dots: 4 numbers on the upper boxplot and 4 numbers on the lower boxplot. – Chirag Vyas May 04 '20 at 03:42
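
As a rough illustration of the "determine the outliers externally" suggestion, here is a minimal sketch. The `dist` data frame below is a made-up stand-in for the asker's data, and `is_outlier` is a hypothetical helper that applies the usual 1.5 * IQR rule; neither comes from the original post.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's `dist` data frame
dist <- data.frame(
  treatment = rep(c("A", "B"), each = 10),
  outcome   = c(8, 9, 9, 10, 10, 10, 11, 11, 12, 20,
                1, 11, 12, 12, 13, 13, 13, 14, 14, 15)
)

# Hypothetical helper: TRUE for points beyond 1.5 * IQR from the quartiles
is_outlier <- function(x) {
  q <- quantile(x, c(0.25, 0.75))
  fence <- 1.5 * (q[2] - q[1])
  unname(x < q[1] - fence | x > q[2] + fence)
}

# Flag outliers within each treatment group, outside of ggplot
dist$outlier <- as.logical(ave(dist$outcome, dist$treatment, FUN = is_outlier))

# Draw the boxes without outlier dots, then label only the flagged rows
p <- ggplot(dist, aes(x = treatment, y = outcome)) +
  geom_boxplot(outlier.shape = NA) +
  geom_text(data = dist[dist$outlier, ], aes(label = outcome)) +
  coord_flip()
print(p)
```

Passing a filtered data frame to `geom_text` means only the outlier rows get a label, so no transparency trick is needed.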

2 Answers

One solution is to flag the outliers first and then use transparency so that only the outlier labels are visible.

Try it on the mpg dataset.

library(ggplot2)
library(dplyr)
data(mpg)

mpg %>%
  group_by(drv) %>%
  mutate(outlier = as.numeric(  # so ggplot doesn't complain about alpha being discrete
    !between(cty, 
            quantile(cty)[2] - 1.5*IQR(cty),
            quantile(cty)[4] + 1.5*IQR(cty)))) %>% 
  ggplot(aes(drv, cty, label=cty)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_text(aes(alpha=outlier), show.legend=FALSE) +
  scale_alpha_continuous(range = c(0, 1))

Here the label is the cty variable, but you can replace that with another one that represents the data you're after (number of participants).
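
To show what swapping the label might look like, here is a sketch that reuses the same transparency approach on mpg. The `n_participants` column is hypothetical and filled with placeholder values (a running row count per group), since mpg has no participant counts; in real data it would be whatever column holds them.

```r
library(ggplot2)
library(dplyr)

plot_data <- mpg %>%
  group_by(drv) %>%
  mutate(
    # 1 for outliers, 0 otherwise (numeric so alpha is continuous)
    outlier = as.numeric(!between(cty,
                                  quantile(cty, 0.25) - 1.5 * IQR(cty),
                                  quantile(cty, 0.75) + 1.5 * IQR(cty))),
    # Hypothetical column: placeholder values standing in for participant counts
    n_participants = row_number()
  ) %>%
  ungroup()

# Same plot as above, but the label now comes from n_participants
p <- ggplot(plot_data, aes(drv, cty, label = n_participants)) +
  geom_boxplot(outlier.shape = NA) +
  geom_text(aes(alpha = outlier), show.legend = FALSE) +
  scale_alpha_continuous(range = c(0, 1))
print(p)
```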

Edward
  • Thanks for your reply. I don't want to add the y-axis value to the outlier point. I want to add a manual number (in my case, the number of participants). I saw this post before but did not find it useful in my case. – Chirag Vyas May 04 '20 at 03:58
  • Then use that instead. As long as it is in the dataset, the code will work. – Edward May 04 '20 at 03:58

You can add a text layer whose label uses a conditional ifelse() to show the actual values of your outliers, where threshold is whatever cutoff value you choose:

geom_text(aes(label = ifelse(y > threshold, y, "")))

This sets the label to an empty string when the y-value is below the threshold, and to the value of y when it exceeds the outlier threshold.

You can also use a paste( ) function to add some text with the values.
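
A small sketch of the ifelse() plus paste() idea on the mpg data. The `threshold` of 25 is an assumed cutoff picked only for illustration, not a computed outlier boundary.

```r
library(ggplot2)

# Assumed cutoff, chosen only for illustration
threshold <- 25

# Label points above the cutoff with some text plus the value; blank otherwise
p <- ggplot(mpg, aes(drv, cty)) +
  geom_boxplot(outlier.shape = NA) +
  geom_text(aes(label = ifelse(cty > threshold, paste("cty =", cty), "")))
print(p)
```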

sconfluentus
  • Hello, thanks for your reply. I got your point, but I want to add 2 separate sets of numbers for my upper and lower boxplots. Can you please help me with the exact geom_text code? I have already posted my code and example numbers in the question. – Chirag Vyas May 04 '20 at 04:03
  • The easiest way to do what you want is to create another column in your data with the values you want plotted at your outliers, and then, when `y > threshold`, reference that new column instead of the `y` column I used above. The condition can cover both sides (greater than an upper value or less than a lower value); combine the two comparisons to include both upper and lower outliers. – sconfluentus May 05 '20 at 05:18
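
One way the extra-column idea from this last comment might look, as a sketch only. The `dist` data, the `participants` column, and the fixed `lower` and `upper` cutoffs are all made up for illustration; the two comparisons are combined with `|` so that points on either side get a label.

```r
library(ggplot2)

# Hypothetical data: `participants` holds the numbers to show at outliers
dist <- data.frame(
  treatment    = rep(c("A", "B"), each = 6),
  outcome      = c(2, 10, 11, 12, 13, 22,   3, 11, 12, 13, 14, 23),
  participants = c(55, 40, 41, 42, 43, 67,  34, 30, 31, 32, 33, 56)
)

# Assumed fixed cutoffs, picked by eye for this toy data
lower <- 5
upper <- 20

# Use the participants value beyond either cutoff, an empty label otherwise
dist$label <- ifelse(dist$outcome > upper | dist$outcome < lower,
                     dist$participants, "")

p <- ggplot(dist, aes(x = treatment, y = outcome)) +
  geom_boxplot(outlier.shape = NA) +
  geom_text(aes(label = label)) +
  ylim(0, 24) +
  coord_flip()
print(p)
```

With real data the cutoffs would more likely be computed per group (for example with the 1.5 * IQR rule) rather than fixed by hand.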