not enough 'x' or 'y' observations in ggplot2 for t-tests or Wilcoxon test

Question

I'm able to run a t-test or a Wilcoxon test on the data with no warnings, but I get an error when I try to plot in with ggpubr 's function stat_pvalue_manual() in ggplot2

The t-test works fine:

### running a t-test (no problem):

t.test(SUJ_PRE ~ AMOSTRA, data = data, exact = F)

But I get the error in stat_pvalue_manual:

### trying to plot it: 

data %>%
  ggplot(., aes(x = AMOSTRA, y = SUJ_PRE)) +
  stat_boxplot(geom = "errorbar",
               width = 0.15) +
  geom_boxplot(aes(fill = AMOSTRA), outlier.colour = "#9370DB", outlier.shape = 19,
               outlier.size= 2, notch = T) +
  scale_fill_manual(values = c("PB" = "#E6E6FA", "PE" = "#CCCCFF"), 
                    label = c("PB" = "PB", "PE" = "PE"),
                    name = "Amostra:") +
  stat_summary(fun = mean, geom = "point", shape = 20, size= 5, color= "#9370DB") + 
  stat_pvalue_manual(data %>%
                       t_test(SUJ_PRE ~ AMOSTRA) %>%
                       add_xy_position(),
                     label = "t = {round(statistic, 2)}, p = {p}") 

Error in `mutate()`:
! Problem while computing `data = map(.data$data, .f, ...)`.
Caused by error in `t.test.default()`:
! not enough 'x' observations

With SUJ_PRE I'm getting the error with 'x' , but I've also got the 'y' message with another variable too. Any thoughts on that? I've seem some similar questions, but I couldn't solve my problem. Thanks in advance.
data:

> dput(data)
structure(list(ID = c("COPA_M1B", "COPA_M1C", "COPA_M2B", "COPA_M2C", 
"COPA_M3C", "COPA_M3A", "COPA_M3B", "COPA_H1A", "COPA_H1B", "COPA_H2A", 
"COPA_H2B", "COPA_H2C", "COPA_H3A", "COPA_H3B", "NI_M1B", "NI_M1C", 
"NI_M2A", "NI_M2B", "NI_M3C", "NI_M3A", "NI_M3B", "NI_H1A", "NI_H1B", 
"NI_H1C", "NI_H2B", "NI_H2C", "NI_H3A", "NI_H3C", "CACEM_M1A", 
"CACEM_M1B", "CACEM_M1C", "CACEM_M2B", "CACEM_M3B", "CACEM_H1B", 
"CACEM_H1C", "CACEM_H2A", "CACEM_H2C", "OEIRAS_M1B", "OEIRAS_M1C", 
"OEIRAS_M2B", "OEIRAS_M3B", "OEIRAS_M3C", "OEIRAS_H1B"), SUJ_PRE = c(25, 
40, 56, 49, 47, 38, 58, 38, 42, 71, 43, 46, 74, 43, 45, 35, 70, 
33, 45, 53, 50, 59, 62, 41, 35, 43, 40, 21, 23, 33, 35, 21, 36, 
15, 31, 19, 31, 20, 22, 20, 19, 21, 25), AMOSTRA = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("PB", "PE"
), class = "factor")), class = c("grouped_df", "tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -43L), groups = structure(list(
    ID = c("CACEM_H1B", "CACEM_H1C", "CACEM_H2A", "CACEM_H2C", 
    "CACEM_M1A", "CACEM_M1B", "CACEM_M1C", "CACEM_M2B", "CACEM_M3B", 
    "COPA_H1A", "COPA_H1B", "COPA_H2A", "COPA_H2B", "COPA_H2C", 
    "COPA_H3A", "COPA_H3B", "COPA_M1B", "COPA_M1C", "COPA_M2B", 
    "COPA_M2C", "COPA_M3A", "COPA_M3B", "COPA_M3C", "NI_H1A", 
    "NI_H1B", "NI_H1C", "NI_H2B", "NI_H2C", "NI_H3A", "NI_H3C", 
    "NI_M1B", "NI_M1C", "NI_M2A", "NI_M2B", "NI_M3A", "NI_M3B", 
    "NI_M3C", "OEIRAS_H1B", "OEIRAS_M1B", "OEIRAS_M1C", "OEIRAS_M2B", 
    "OEIRAS_M3B", "OEIRAS_M3C"), .rows = structure(list(34L, 
        35L, 36L, 37L, 29L, 30L, 31L, 32L, 33L, 8L, 9L, 10L, 
        11L, 12L, 13L, 14L, 1L, 2L, 3L, 4L, 6L, 7L, 5L, 22L, 
        23L, 24L, 25L, 26L, 27L, 28L, 15L, 16L, 17L, 18L, 20L, 
        21L, 19L, 43L, 38L, 39L, 40L, 41L, 42L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -43L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE))

I think this is a minimal reproducible example: `rstatix::t_test(SUJ_PRE ~ AMOSTRA, data = data)` — Isaiah, Nov 01 '22 at 13:18

score 1 · Accepted Answer · answered Nov 01 '22 at 14:30

1

Your data frame is grouped by ID, and rstatisx::t_test will honor the groupings in a data frame. This means some groups only have a single member, which of course causes an error.

The solution is simply to ungroup your data frame when using it inside stat_pvalue_manual:

data %>%
  ggplot(aes(x = AMOSTRA, y = SUJ_PRE)) +
  stat_boxplot(geom = "errorbar",
               width = 0.15) +
  geom_boxplot(aes(fill = AMOSTRA), outlier.colour = "#9370DB", 
               outlier.shape = 19,
               outlier.size= 2, notch = T) +
  scale_fill_manual(values = c("PB" = "#E6E6FA", "PE" = "#CCCCFF"), 
                    label = c("PB" = "PB", "PE" = "PE"),
                    name = "Amostra:") +
  stat_summary(fun = mean, geom = "point", shape = 20, size= 5, 
               color = "#9370DB") + 
  stat_pvalue_manual(data %>% ungroup() %>%
                       t_test(SUJ_PRE ~ AMOSTRA) %>%
                       add_xy_position(),
                     label = "t = {round(statistic, 2)}, p = {p}")

answered Nov 01 '22 at 14:30

Allan Cameron

147,086
7
49
87

ohhhh, thank you! Now it's working just fine! It may be a VERY silly question, but how did you notice that? – Larissa Cury Nov 01 '22 at 16:08
1

@LarissaCury after noting @Isaiah's comment under your question, I thought the result was odd, and ran `rstatix::t_test` through the debugger using `debug(rstatix::t_test)`. I found that this threw an error only when it tried to run the function `rstatix:::tow_sample_test`, so I ran this through the debugger too. I noticed that this function was returning a result from inside a line that said `if (is_grouped_df(data))`. That made me realise that the data frame must be grouped, and sure enough, when I checked your sample data, it was. So no, not a silly question! – Allan Cameron Nov 01 '22 at 16:17
What a journey! Thanks for the feedback and for the effort :) – Larissa Cury Nov 01 '22 at 16:20
what strikes me most is that I was using the very same df with the ```t.test``` function and no problem there...I couldn't come up with a solution by myself – Larissa Cury Nov 01 '22 at 16:23

not enough 'x' or 'y' observations in ggplot2 for t-tests or Wilcoxon test

1 Answers1