I would like to add a star on top of a bar in a barplot to indicate statistical significance.

I'm using the script below. However, I keep getting error messages, even though I used the exact same code from another post:


> gg <- ggplot(aes(x=category, y=mean, fill=split, group=split), data=data)
> gg <- gg + geom_bar(stat='identity', position = position_dodge(), width=.5)
> gg <- gg + geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), position = position_dodge(width=.5), width=.2)
> gg <- gg +  scale_x_discrete(labels=c("Accuracy", "Precision", "Recall"))
> gg <- gg + xlab("Precision metrics") + ylab("Mean") + labs (fill="Classifier") + scale_fill_discrete(labels = c("k-NN", "Decision trees"))
> gg <- gg + theme(legend.position = "none") 
> 
> 
> label.df <- data.frame(Group = c("Accuracy"),
+                        Value = c(0.99))
> 
> gg + geom_text(data = label.df, label = "**")
Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Error in FUN(X[[i]], ...) : object 'category' not found

Here's the plot. I would like to add the star on top of the red Accuracy bar.

Any input is appreciated!

PS: I'm providing a dput() sample below:

> dput(data)
structure(list(mean = c(0.9685, 0.925333333333333, 0.985666666666667, 
0.926833333333333, 0.968666666666667, 0.931333333333333), sd = c(0.0150831031289984, 
0.0301838809079725, 0.013306639946533, 0.0589488478824367, 0.0147873820085459, 
0.0712338870669478), category = structure(c(1L, 1L, 2L, 2L, 3L, 
3L), .Label = c("1", "2", "3"), class = "factor"), split = structure(c(1L, 
2L, 1L, 2L, 1L, 2L), .Label = c("1", "2"), class = "factor")), row.names = c("a", 
"c", "e", "g", "i", "k"), class = "data.frame")
juansalix
  • I think your variable names in `label.df` do not match the ones in `data`. You need the x and y variable names to match. Instead of `Group` it should be `category` and instead of `Value` it should be `mean`. – qdread May 06 '19 at 18:13
  • @qdread Thank you for your message. I'm still getting an Error: ```> label.df <- data.frame(category = c("Accuracy"), + mean = c(0.99)) > gg + geom_text(data = label.df, label = "**") Don't know how to automatically pick scale for object of type function. Defaulting to continuous. Error: All columns in a tibble must be 1d or 2d objects: * Column `fill` is function * Column `group` is function Call `rlang::last_error()` to see a backtrace``` – juansalix May 06 '19 at 18:17
  • Keep in mind that without a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), we're just guessing what the issue with your data is. You say you used the exact same code as another post...did that other post have the exact same column names and data types as yours? – camille May 06 '19 at 18:21
  • @camille I just posted a `dput()` sample of my data if that helps! – juansalix May 06 '19 at 21:22
  • The arguments to `aes` in your initial `ggplot` call trickle down to the geoms you add on. The data you use in your last `geom_text` doesn't have the variable `category`, hence the error that `category` can't be found – camille May 06 '19 at 21:34
  • @camille Thank you for your reply. I used `gg + geom_text(data = label.df, label = "**", category= c("Accuracy"))` for my last call instead, and got: ```Warning: Ignoring unknown parameters: category Don't know how to automatically pick scale for object of type function. Defaulting to continuous. Error: All columns in a tibble must be 1d or 2d objects: * Column `fill` is function * Column `group` is function```. I can't understand what is going on. – juansalix May 06 '19 at 21:46
  • You can't just add column names in a geom, though. `ggplot` will be looking for columns with the name `category` in any data frame used here. You can set `inherit.aes` to false, or just add `category` to the `aes` of geoms that deal with that column – camille May 06 '19 at 21:51
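
The point made in the comments above can be sketched as follows. This is a minimal example, not the asker's exact script: the data frame is rebuilt from the `dput()` output with values rounded to four decimals, and the `label` column and the empty second label are assumptions added for illustration. Because `mean` and `split` are also base R functions, a missing column of that name in `label.df` most likely resolves to the function instead, which would explain the "object of type function" messages.

```r
library(ggplot2)

# Data rebuilt from the dput() output in the question (rounded to 4 decimals).
data <- data.frame(
  mean     = c(0.9685, 0.9253, 0.9857, 0.9268, 0.9687, 0.9313),
  sd       = c(0.0151, 0.0302, 0.0133, 0.0589, 0.0148, 0.0712),
  category = factor(c(1, 1, 2, 2, 3, 3)),
  split    = factor(c(1, 2, 1, 2, 1, 2))
)

gg <- ggplot(data, aes(x = category, y = mean, fill = split, group = split)) +
  geom_col(position = position_dodge(width = 0.5), width = 0.5) +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                position = position_dodge(width = 0.5), width = 0.2) +
  scale_x_discrete(labels = c("Accuracy", "Precision", "Recall"))

# The label data carries every column the inherited aes() refers to
# (category, mean, split). Both split levels are present so that the
# dodge offsets match the bars; the second label is left empty.
label.df <- data.frame(
  category = factor(c("1", "1"), levels = levels(data$category)),
  split    = factor(c("1", "2"), levels = levels(data$split)),
  mean     = c(0.99, 0.99),
  label    = c("**", "")
)

# Option 1: rely on the inherited aesthetics and only add the label.
p_inherit <- gg +
  geom_text(aes(label = label), data = label.df,
            position = position_dodge(width = 0.5), show.legend = FALSE)

# Option 2: switch inheritance off and state the aesthetics explicitly.
p_own <- gg +
  geom_text(aes(x = category, y = mean, label = label, group = split),
            data = label.df, inherit.aes = FALSE,
            position = position_dodge(width = 0.5))
```

Printing `p_inherit` or `p_own` should draw the stars above the first bar of the Accuracy group. The `y` value of 0.99 is chosen by hand to clear the error bar.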

1 Answer

You can try `annotate()`. It is also recommended to use `geom_col()` instead of `geom_bar(stat = 'identity')`.

 ggplot(data, aes(x=category, y=mean, fill=split)) +
   geom_col(position = position_dodge(width = 0.9)) + 
   geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), position = position_dodge(width=.9), width = .2) +
   scale_x_discrete(labels=c("Accuracy", "Precision", "Recall")) +
   theme(legend.position = "none") +
   annotate("text", x = 1, y = 1, label = "**") 
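
With `x = 1` the star is centred over the whole Accuracy group. A possible way to move it over a single dodged bar is to compute that bar's centre by hand, as sketched below. The helper `bar_x()`, the 0.01 vertical offset, and the rounded data values are assumptions for illustration and are not part of ggplot2 or the original answer.

```r
library(ggplot2)

# Data rebuilt from the dput() output in the question (rounded to 4 decimals).
data <- data.frame(
  mean     = c(0.9685, 0.9253, 0.9857, 0.9268, 0.9687, 0.9313),
  sd       = c(0.0151, 0.0302, 0.0133, 0.0589, 0.0148, 0.0712),
  category = factor(c(1, 1, 2, 2, 3, 3)),
  split    = factor(c(1, 2, 1, 2, 1, 2))
)

dodge_width <- 0.9
n_groups    <- nlevels(data$split)

# Hypothetical helper: centre of the k-th dodged bar in the category
# drawn at integer position x0, assuming equal-width bars.
bar_x <- function(x0, k) {
  x0 + dodge_width * ((k - 0.5) / n_groups - 0.5)
}

# Place the star slightly above the error bar of the first Accuracy bar.
first_bar <- data[data$category == "1" & data$split == "1", ]
star_y    <- first_bar$mean + first_bar$sd + 0.01

p <- ggplot(data, aes(x = category, y = mean, fill = split)) +
  geom_col(position = position_dodge(width = dodge_width)) +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                position = position_dodge(width = dodge_width), width = 0.2) +
  scale_x_discrete(labels = c("Accuracy", "Precision", "Recall")) +
  theme(legend.position = "none") +
  annotate("text", x = bar_x(1, 1), y = star_y, label = "**")
```

Here `bar_x(1, 1)` evaluates to 0.775, the centre of the left bar when two bars share a dodge width of 0.9.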

Roman