
I am facing issues with stat_compare_means, geom_pwc, and stat_pvalue_manual. The issues are as follows:

  1. stat_compare_means works well for the most part, but I cannot get rid of the brackets for non-significant (ns) comparisons. hide.ns only removes the label itself, not the bracket, and I cannot find a way to stop those brackets from appearing.

  2. With both geom_pwc and stat_pvalue_manual, I read that hide.ns also hides the brackets. However, when I use either of these functions with hide.ns=TRUE, all of my significance annotations are removed, even though not all of them are ns.

My data is a data frame with three relevant variables: willingness (y-axis), gender (x-axis), and risk (facet). My plot is a boxplot. There are three facets (risk: 0.1%, 2%, and 10%), four gender categories (men, women, non-binary, and general), and willingness is a continuous 0-100 scale. The "general" category is just the combined data of the other three gender categories, included to show the overall, non-gender-specific trend, and it should be left out of the statistical comparisons.

My code is as follows:

ggboxplot(combined_data,x = "gender", y = "willingness", color = "black", fill="gender", palette = c("coral4","#BBD1EA","#FFBB99","#FFF6BD"), facet.by = "Risk", short.panel.labs = FALSE, outlier.shape = NA) +
geom_point(aes(x=gender,y=willingness,fill=gender),position=position_jitterdodge(),alpha=0.3,size=1,shape=20) + 
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + 
theme(legend.position="none", axis.text.x=element_blank(),axis.ticks.x=element_blank(),axis.title.x = element_blank()) + 
stat_compare_means(comparisons = my_comparisons, label="p.signif", hide.ns = TRUE)

Where, in stat_compare_means, "my_comparisons" is:

my_comparisons <- list(c("Men", "Women"), c("Men", "Non-binary"), c("Women", "Non-binary"))

This produces the first image attached below.

[Image: using stat_compare_means]

When I try to use geom_pwc or stat_pvalue_manual, they both produce the same figure. Note: neither seems to let me specify the comparisons, so "general" is included here. If anyone knows how to specify comparisons, that would be great. The code is:

ggboxplot(combined_data,x = "gender", y = "willingness", color = "black", fill="gender", palette = c("coral4","#BBD1EA","#FFBB99","#FFF6BD"), facet.by = "Risk", short.panel.labs = FALSE, outlier.shape = NA) + 
geom_point(aes(x=gender,y=willingness,fill=gender),position=position_jitterdodge(),alpha=0.3,size=1,shape=20) + 
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + 
theme(legend.position="none", axis.text.x=element_blank(),axis.ticks.x=element_blank(),axis.title.x = element_blank()) + 
geom_pwc(group.by="x.var",label="p.signif",hide.ns=TRUE)

And the figure is found below.

[Image: using geom_pwc, hide.ns=TRUE]

And this is what it looks like with hide.ns=FALSE. Note that some of the comparisons are significant:

[Image: using geom_pwc, hide.ns=FALSE]

Using stat_pvalue_manual produces the same result as geom_pwc, with the following code:

stat.test <- compare_means(willingness ~ gender, data = combined_data, 
              group.by = "Risk")
ggboxplot(.......same as above) + 
stat_pvalue_manual(stat.test, label = "p.signif", hide.ns = TRUE)

So if anyone can help with either:

  1. Removing ns brackets with stat_compare_means, or
  2. Specifying comparisons and fixing the hide.ns issue with geom_pwc or stat_pvalue_manual

I would really appreciate it. Thanks!

  • Hi Saud! Welcome to StackOverflow. If you could make your example reproducible, that would be great! See: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Mark Jul 12 '23 at 03:18
  • I didn't know that `ggpubr::ggboxplot` existed but I have done a grouped box plot using `ggplot::geom_boxplot` that I think would fit your specifications. Let me know if you're interested in seeing a code example and I'll post an answer in the next day or so. – Reed Merrill Jul 12 '23 at 04:07

1 Answer

You can compute the p-values before plotting and then pass only the significant comparisons to stat_compare_means. In my case stat_compare_means uses the Wilcoxon test, so I use Seurat's FindMarkers, which is quick and also uses the Wilcoxon test. For example:

library(Seurat)
library(ggpubr)

# sne_subset (a Seurat object) and gene (a feature name) are assumed to be defined earlier.
# Compute the p-value for each candidate comparison.
pval_AL_vs_DR <- FindMarkers(sne_subset, ident.1 = "AL", ident.2 = "DR", features = c(gene), min.pct = 0, logfc.threshold = 0)[gene, "p_val"]
pval_AL_vs_SR100 <- FindMarkers(sne_subset, ident.1 = "AL", ident.2 = "SR100", features = c(gene), min.pct = 0, logfc.threshold = 0)[gene, "p_val"]

# Keep only the comparisons that are significant at the 0.05 level.
comparisonsi <- list()
if (pval_AL_vs_DR <= 0.05) comparisonsi <- c(comparisonsi, list(c("AL", "DR")))
if (pval_AL_vs_SR100 <= 0.05) comparisonsi <- c(comparisonsi, list(c("AL", "SR100")))

# Only the retained comparisons get a bracket.
p_violin <- VlnPlot(sne_subset, features = gene, group.by = "diet", pt.size = 0, log = TRUE) +
stat_compare_means(comparisons = comparisonsi, method = "wilcox.test", label = "p.signif", hide.ns = TRUE)
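
The same idea can be written without one branch per combination, which helps once there are more than two candidate pairs. Below is a minimal sketch in base R, assuming a plain numeric vector and a grouping vector rather than a Seurat object. The helper name significant_pairs and the toy data are hypothetical and only for illustration; the p-values come from base R's wilcox.test and are not adjusted for multiple testing.

```r
# Hypothetical helper (not part of ggpubr or Seurat): keep only the pairs
# whose two-sided Wilcoxon rank-sum test is significant at `alpha`.
significant_pairs <- function(values, groups, pairs, alpha = 0.05) {
  is_significant <- vapply(pairs, function(pair) {
    x <- values[groups == pair[1]]
    y <- values[groups == pair[2]]
    wilcox.test(x, y)$p.value <= alpha
  }, logical(1))
  pairs[is_significant]
}

# Toy data, made up for illustration: "a" and "c" overlap, "b" is shifted.
toy <- data.frame(
  group = rep(c("a", "b", "c"), each = 10),
  value = c(1:10, 101:110, 1:10 + 0.5)
)

candidate_pairs <- list(c("a", "b"), c("a", "c"))
kept <- significant_pairs(toy$value, toy$group, candidate_pairs)
print(kept)  # only the "a" vs "b" pair is retained
```

The resulting list can then be passed as comparisons to stat_compare_means, as in the call above. Two caveats: if no pair is significant the list is empty, so it may be safer to skip the stat_compare_means layer in that case, and a single comparisons list is shared by every facet, so for a faceted plot like the one in the question the filtering would presumably have to be done separately per panel.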