
I'm looking to add "n = #" under each category on the x-axis, but I'm not sure how. The counts don't have to sit under the names, as long as they appear somewhere on the plot. I'm also working with two categorical variables, so that may be part of the problem. Let me know if you have any suggestions; I'm new to R.


Here's some information on the dataset and the variables I'm comparing. The overall dataset (scorpions) records scorpion species and the vegetation they're found in. Those are the two things I'm comparing: "species" is the column for the species and "veg" is the column for the vegetation type. Both are character vectors. I really just want to know how to add count labels to my graph for clarity. This is what my graph currently looks like: [image: graph]

I just want to be able to add number labels anywhere. If you want to recreate it, you can use any dataset that has two character columns. The other posts don't help because their examples include a numeric variable as well. If it's not possible to do this, just let me know.

Thank you everyone for the help so far!

library(ggplot2)
library(ggthemes)  # provides theme_stata() and scale_fill_economist()

ggplot(data=scorpions, aes(x=species,y=veg,fill=veg)) +
  geom_bar(stat="identity",color="black",position=position_dodge()) +
  theme_stata() +
  scale_fill_economist() +
  theme(
    axis.text.y = element_text(angle = 0),
    axis.title  = element_text(face="bold"),
    axis.text.x = element_text(face = "italic")
  ) +
  labs(title="Relationship Between Species and Vegetation Type")  

I've tried changing the names in the Excel spreadsheet, but it looks really messy. I've also tried googling answers but nothing works since it's two categorical variables.

  • OP, can you post a dataset so that we can reproduce what you have created so far? Alternatively, you could find one of the built-in datasets (type `data()` to see them all) and we can work from that by example. – chemdork123 Dec 01 '22 at 02:00
  • Looking at the code you have, it's hard to understand what your dataset `scorpions` looks like. I think we can help you even if you can explain what each column in your dataset contains. You can post the dataframe here directly via `dput(scorpions)` and copy and paste that code block into your question. If it's too big or you prefer not sharing, even pasting the output of `str(scorpions)` could help. – chemdork123 Dec 01 '22 at 02:03
  • @mychemicalromance, I think the two dupe links do a good job demonstrating how to add labels to your `geom_bar`s. If it fails for some reason, or I've mis-read your question, please [edit] your question to add sample data as previously suggested, then @ping me and we'll work it out. – r2evans Dec 01 '22 at 03:12
  • @r2evans I added some more info let me know if that helps or if you need more info – mychemicalromance Dec 01 '22 at 23:41
  • @chemdork123 ^ i added some more info – mychemicalromance Dec 01 '22 at 23:41

1 Answer


This question differs from the most common dupe links for grouped bar plots in ggplot2 ("How to put labels over geom_bar for each bar in R with ggplot2" and "How to put labels over geom_bar in R with ggplot2") in that those tend to cover one categorical variable only; this question asks about two categorical variables.

But it's not that hard: we just need to come up with a number for all combinations of each of the two categoricals. I'll use xtabs for that.

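Using the ggplot2::diamonds dataset, plotting cut against color (both categorical; in diamonds they are ordered factors rather than character vectors, but xtabs handles character columns the same way):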

library(ggplot2)
head(diamonds)
# # A tibble: 6 x 10
#   carat cut       color clarity depth table price     x     y     z
#   <dbl> <ord>     <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl>
# 1  0.23 Ideal     E     SI2      61.5    55   326  3.95  3.98  2.43
# 2  0.21 Premium   E     SI1      59.8    61   326  3.89  3.84  2.31
# 3  0.23 Good      E     VS1      56.9    65   327  4.05  4.07  2.31
# 4  0.29 Premium   I     VS2      62.4    58   334  4.2   4.23  2.63
# 5  0.31 Good      J     SI2      63.3    58   335  4.34  4.35  2.75
# 6  0.24 Very Good J     VVS2     62.8    57   336  3.94  3.96  2.48

Starting with a simple (non-themed) bar plot:

# Same mapping as in the question: x and y are both categorical, fill follows y
gg <- ggplot(data=diamonds, aes(x=cut,y=color,fill=color)) +
  geom_bar(stat="identity",color="black",position=position_dodge())
gg

[Figure: ggplot2 bar plot with two categorical variables]

Calculate the frequency table:

xtabs(~ cut + color, data = diamonds)
#            color
# cut            D    E    F    G    H    I    J
#   Fair       163  224  312  314  303  175  119
#   Good       662  933  909  871  702  522  307
#   Very Good 1513 2400 2164 2299 1824 1204  678
#   Premium   1603 2337 2331 2924 2360 1428  808
#   Ideal     2834 3903 3826 4884 3115 2093  896

### convert to a frame
tab <- data.frame(xtabs(~ cut + color, data = diamonds))
head(tab)
#         cut color Freq
# 1      Fair     D  163
# 2      Good     D  662
# 3 Very Good     D 1513
# 4   Premium     D 1603
# 5     Ideal     D 2834
# 6      Fair     E  224
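
For the "n = #" wording the question asks for, one option is to store the label text as an extra column in the count table. This is a minimal sketch on a made-up stand-in for scorpions; the rows and the values sp_a, sp_b, desert and scrub are assumptions for illustration, not the asker's data:

```r
# Hypothetical stand-in for the asker's data: two character columns.
scorpions <- data.frame(
  species = c("sp_a", "sp_a", "sp_b", "sp_b", "sp_b"),
  veg     = c("desert", "scrub", "desert", "desert", "scrub"),
  stringsAsFactors = FALSE
)

# Count every species/veg combination, then build the label text.
tab <- as.data.frame(xtabs(~ species + veg, data = scorpions))
tab$label <- paste0("n = ", tab$Freq)
tab
```

With a table like this, mapping aes(label = label) in geom_text should print "n = 2" and so on instead of the bare count.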

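New plot, adding geom_text (it inherits x, y, and fill from gg, so only the label needs to be mapped; the dodge width of 0.9 matches the bars' default width so each label lands over its own bar):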

gg + 
  geom_text(data = tab, aes(label = Freq), 
            position = position_dodge(width = 0.9), vjust = -0.25)

[Figure: ggplot2 bar plot with two categorical variables, now with count labels]
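
If the counts are wanted under the names on the x-axis, as the question first asks, a possible variation is to put per-category totals into the axis labels and to draw the counts themselves as bar heights. This is only a sketch of an alternative, not the plot above: it swaps in geom_col with y = Freq, and the names totals, axis_labels and p are my own.

```r
library(ggplot2)

# Per-combination counts (bar heights) and per-cut totals (axis labels).
tab    <- as.data.frame(xtabs(~ cut + color, data = diamonds))
totals <- table(diamonds$cut)

# Named vector: names match the x breaks, values are the text to display.
axis_labels <- setNames(
  paste0(names(totals), "\nn = ", as.vector(totals)),
  names(totals)
)

p <- ggplot(tab, aes(x = cut, y = Freq, fill = color)) +
  geom_col(color = "black", position = position_dodge(width = 0.9)) +
  geom_text(aes(label = Freq),
            position = position_dodge(width = 0.9),
            vjust = -0.25, size = 2.5) +
  scale_x_discrete(labels = axis_labels)
p
```

Here each bar's height is the count for one cut/color pair, and each x-axis entry shows the category name with its total on a second line.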

r2evans
  • mychemicalromance, see my answer. The only thing that keeps this from being a perfect dupe of the previous dupe links is that this requires the `tab` data here. I hope this helps. – r2evans Dec 02 '22 at 00:48