
I am trying to plot electrical conductivity values of water for 10 different geographic districts as 10 separate boxplots on a single plot. I want to add asterisks to each boxplot indicating where values significantly differ from 400 (as opposed to significantly differ from each other, or from the mean of all values). My code currently looks like this:

    well.data$ref <- 400

    ggboxplot(well.data, x = "District", y = "Electrical_conductivity", color = "District",
              add = "jitter", legend = "none") +
      geom_boxplot() +
      geom_text(aes(label = Sig, y = MaxWidth + 0.2), size = 10,
                data = t_tests) +
      geom_hline(yintercept = 400, linetype = "dashed", color = "red") +
      stat_compare_means(method = "anova", label.y = 40) +
      stat_compare_means(label = "p.signif", method = "t.test",
                         ref.group = "ref") +
      theme(text = element_text(size = 20))

This generates the following warnings:

    Warning messages:
    1: Computation failed in stat_compare_means(): missing value where TRUE/FALSE needed
    2: Removed 10 rows containing missing values (geom_text).

My data looks roughly like this:

    set.seed(42)  ## for sake of reproducibility
    n <- 100
    well.data <- data.frame(
      District = rep(LETTERS[1:10], n),
      Electrical_conductivity = sample(200:500, n, replace = TRUE),
      ref = 400)

  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Since we don't have any data, we can't run this to test it. – MrFlick Mar 23 '22 at 20:09
  • Added a reproducible example, hope it helps. – Shannon O'Hara Mar 23 '22 at 20:18

1 Answer


I'm not sure I understand your question correctly, but if you would like to compare each group against the base-mean (all groups combined), you can use stat_compare_means(label = "p.signif", method = "t.test", ref.group = ".all.").

Sample code:

#t_tests <- with(well.data, pairwise.t.test(Electrical_conductivity, District, p.adjust.method="bonferroni"))$p.value
#t_tests<-data.frame(t_tests) # this is missing from your sample data


  ggboxplot(well.data, x = "District", y = "Electrical_conductivity", color = "District", add = "jitter", legend = "none") +
  geom_boxplot() +
  #geom_text(aes(label = Sig, y = MaxWidth + 0.2), size = 10, data = t_tests)+ # missing the t_test data
  geom_hline(yintercept=400, linetype="dashed", color = "red")+
  stat_compare_means(method = "anova", label.y = 40)+ 
  stat_compare_means(method = "t.test", label = "p.signif", 
                     ref.group = ".all.")  # ".all." tests each group against all groups combined (base-mean)


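Note that ref.group has to be one of the groups on the x-axis (or ".all."), so a column such as "ref" will not work. What you could do instead is compare every district with district "A" by setting ref.group = "A".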


Sample data:

set.seed(42)  ## for sake of reproducibility
n <- 100


well.data <- data.frame(
  District=rep(LETTERS[1:10], n),
  Electrical_conductivity=sample(200:500, n, replace=TRUE)) #  I modified your sample data
Rfanatic
  • Thank you for your help! This is very close to what I am looking for, except that rather than comparing each group to the base-mean, I want to compare each group to the value 400. Do you know how I could modify the code to do that? – Shannon O'Hara Mar 24 '22 at 14:27
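
One possible way to compare each district with the fixed value 400, rather than with another group, is to run a one-sample t-test per district with base R's t.test(x, mu = 400) and draw the resulting labels with geom_text(). The following is only a sketch under some assumptions: the names ref_value, groups, sig_labels, p_adj and y_pos are made up for illustration, the Bonferroni correction and the star cut-offs are one reasonable choice rather than a requirement, and the plot is only drawn if ggpubr is installed.

```r
set.seed(42)  # same sample data as above
n <- 100
well.data <- data.frame(
  District = rep(LETTERS[1:10], n),
  Electrical_conductivity = sample(200:500, n, replace = TRUE)
)

ref_value <- 400  # fixed value each district is tested against (assumption: name is illustrative)

# Split the measurements by district, then run a one-sample t-test per district
groups <- split(well.data$Electrical_conductivity, well.data$District)
p_values <- sapply(groups, function(x) t.test(x, mu = ref_value)$p.value)

# One row per district: adjusted p-value and a y position just above the largest value
sig_labels <- data.frame(
  District = names(p_values),
  p_adj    = p.adjust(p_values, method = "bonferroni"),
  y_pos    = sapply(groups, max) + 15
)

# Translate adjusted p-values into significance symbols
sig_labels$Sig <- as.character(cut(
  sig_labels$p_adj,
  breaks = c(-Inf, 0.0001, 0.001, 0.01, 0.05, Inf),
  labels = c("****", "***", "**", "*", "ns")
))

# Draw the boxplots with the labels on top (skipped if ggpubr is not installed)
if (requireNamespace("ggpubr", quietly = TRUE)) {
  library(ggpubr)  # also attaches ggplot2
  p <- ggboxplot(well.data, x = "District", y = "Electrical_conductivity",
                 color = "District", add = "jitter", legend = "none") +
    geom_hline(yintercept = ref_value, linetype = "dashed", color = "red") +
    geom_text(data = sig_labels,
              aes(x = District, y = y_pos, label = Sig),
              inherit.aes = FALSE, size = 6)
  print(p)
}
```

Because the labels come from a separate data frame, stat_compare_means() with ref.group is not needed for this comparison; inherit.aes = FALSE keeps geom_text() from looking for the plot's own y and color columns in sig_labels.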