
Below is a snapshot of my dataset:

   Town       Age_Group Race       Count_Type Total_Count
   <chr>      <chr>     <chr>      <chr>            <dbl>
 1 Milwaukee  12-17     White      Initial            500
 2 Milwaukee  12-17     White      Full               424
 3 Milwaukee  12-17     Black      Initial           1080
 4 Milwaukee  12-17     Black      Full               771
 5 Milwaukee  12-17     AmerIndian Initial             11
 6 Milwaukee  12-17     AmerIndian Full                 5

Here is the code for the plot. I should also mention that ggplot2 is a hard requirement.

# Visualization (label formatting uses scales::number)
library(ggplot2)
ggplot(data = milwaukee, aes(x = Age_Group, y = Total_Count, fill = Race)) +
  geom_bar(stat = 'identity', position = 'stack') +
  labs(x = 'Age Group', y = 'Total Vaccinated by Age Group',
       title = 'Milwaukee Total Vaccinated by Age Group & Race') + 
  # scale_y_continuous(trans = 'log2') +
  geom_text(aes(label = scales::number(Total_Count, big.mark = ',', accuracy = 1)), size = 2, 
            position = position_stack(vjust = 0.5)) + 
  theme_classic() + 
  theme(text = element_text(size = 9, family = 'mono'), 
        legend.position = 'bottom',
        plot.title = element_text(hjust = 0.5, size = 11))

Sample data

> dput(milwaukee)
structure(list(Town = c("Milwaukee", "Milwaukee", "Milwaukee", 
"Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", 
"Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", 
"Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", 
"Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", 
"Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", 
"Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", 
"Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", "Milwaukee", 
"Milwaukee", "Milwaukee"), Age_Group = c("12-17", "12-17", "12-17", 
"12-17", "12-17", "12-17", "12-17", "12-17", "18-24", "18-24", 
"18-24", "18-24", "18-24", "18-24", "18-24", "18-24", "25-44", 
"25-44", "25-44", "25-44", "25-44", "25-44", "25-44", "25-44", 
"45-64", "45-64", "45-64", "45-64", "45-64", "45-64", "45-64", 
"45-64", "65+", "65+", "65+", "65+", "65+", "65+", "65+", "65+"
), Race = c("White", "Black", "AmerIndian", "Asian", "Hispanic", 
"MultipleRaces", "Other", "Unknown", "White", "Black", "AmerIndian", 
"Asian", "Hispanic", "MultipleRaces", "Other", "Unknown", "White", 
"Black", "AmerIndian", "Asian", "Hispanic", "MultipleRaces", 
"Other", "Unknown", "White", "Black", "AmerIndian", "Asian", 
"Hispanic", "MultipleRaces", "Other", "Unknown", "White", "Black", 
"AmerIndian", "Asian", "Hispanic", "MultipleRaces", "Other", 
"Unknown"), Count_Type = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = c("Initial", "Full"), class = "factor"), Total_Count = c(500, 
1080, 11, 172, 2404, 105, 135, 272, 1012, 1610, 10, 326, 3051, 
110, 502, 480, 3281, 4185, 34, 738, 10023, 147, 2060, 1907, 4453, 
6361, 41, 695, 9250, 144, 2549, 2043, 4000, 3520, 22, 368, 3554, 
83, 1182, 1354)), row.names = c(NA, -40L), class = c("tbl_df", 
"tbl", "data.frame"))

And below is my messy plot. What can I add or change so that the values do not overlap? Different chart ideas are also welcome.

[Image: the stacked bar chart produced by the code above, with overlapping labels]

user3813620
  • Please provide a [reproducible minimal example](https://stackoverflow.com/q/5963269/8107362). Especially, provide your sample data in a ready-to-copy format, e.g. with `dput()`. – mnist Aug 23 '21 at 21:29
  • In `geom_text()`, you can set `check_overlap = TRUE` to censor overlapping values. – teunbrand Aug 23 '21 at 21:45
  • @mnist thanks for the suggestion. I just edited the question to provide sample data. – user3813620 Aug 23 '21 at 22:22
  • @teunbrand thanks for the tip, but I would like to be able to show all the labels if at all possible. I just provided sample data above – user3813620 Aug 23 '21 at 22:23
  • As a matter of aesthetics, you might also consider sorting the Race values in order of frequency instead of alphabetical. Or, depending on what the takeaway message is, you might want to normalize these by population, since it's hard to know if a given number is high or low without knowing how many people there are in each age/race category. – Jon Spring Aug 23 '21 at 22:40
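The sorting idea from the last comment could be sketched roughly as follows. This is only an illustration: it rebuilds the question's sample data in a compact form and uses base R's `reorder()` to order the Race levels by their total count; normalizing by population is not shown because the question contains no population figures.

```r
library(ggplot2)

# Sample data rebuilt compactly from the question's dput() output
# (Town and Count_Type are constant there and are omitted here).
races <- c("White", "Black", "AmerIndian", "Asian", "Hispanic",
           "MultipleRaces", "Other", "Unknown")
ages <- c("12-17", "18-24", "25-44", "45-64", "65+")
milwaukee <- data.frame(
  Age_Group = rep(ages, each = 8),
  Race = rep(races, times = 5),
  Total_Count = c(500, 1080, 11, 172, 2404, 105, 135, 272,
                  1012, 1610, 10, 326, 3051, 110, 502, 480,
                  3281, 4185, 34, 738, 10023, 147, 2060, 1907,
                  4453, 6361, 41, 695, 9250, 144, 2549, 2043,
                  4000, 3520, 22, 368, 3554, 83, 1182, 1354),
  stringsAsFactors = FALSE
)

# Turn Race into a factor whose levels are ordered by total count
# (smallest total first) instead of alphabetically.
milwaukee$Race <- reorder(milwaukee$Race, milwaukee$Total_Count, FUN = sum)

# The stacking order and the legend now follow the frequency ordering.
p <- ggplot(milwaukee, aes(x = Age_Group, y = Total_Count, fill = Race)) +
  geom_col(position = "stack") +
  theme_classic() +
  theme(legend.position = "bottom")
```

Printing `p` draws the chart; the label layer from the question can be added unchanged.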

2 Answers


You might try ggrepel, but it could take some fiddling to get what you want, given that the data span two orders of magnitude. I used direction = "y" so that the labels are only shifted up and down (which looks tidier), but you might prefer to let the labels move side to side (direction = "x") or in any direction (omit the direction parameter or set it to "both").

...
  ggrepel::geom_text_repel(aes(label = scales::number(Total_Count, big.mark = ',', accuracy = 1)), size = 2, 
            position = position_stack(vjust = 0.5), direction = "y", 
            box.padding = unit(0.01, "lines")) + 
...

[Image: resulting plot with direction = "y"]

...or the same with direction = "x" and segment.color = NA:

[Image: resulting plot with direction = "x"]
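For anyone who wants to run this end to end, here is a minimal sketch that combines the question's plot code with the ggrepel layer above. The compact data construction and the `repel_direction` variable are my own additions for illustration; `geom_col()` is used as the equivalent of `geom_bar(stat = 'identity')`.

```r
library(ggplot2)

# Sample data rebuilt compactly from the question's dput() output
# (Town and Count_Type are constant there and are omitted here).
races <- c("White", "Black", "AmerIndian", "Asian", "Hispanic",
           "MultipleRaces", "Other", "Unknown")
ages <- c("12-17", "18-24", "25-44", "45-64", "65+")
milwaukee <- data.frame(
  Age_Group = rep(ages, each = 8),
  Race = rep(races, times = 5),
  Total_Count = c(500, 1080, 11, 172, 2404, 105, 135, 272,
                  1012, 1610, 10, 326, 3051, 110, 502, 480,
                  3281, 4185, 34, 738, 10023, 147, 2060, 1907,
                  4453, 6361, 41, 695, 9250, 144, 2549, 2043,
                  4000, 3520, 22, 368, 3554, 83, 1182, 1354),
  stringsAsFactors = FALSE
)

# Hypothetical helper variable: switch between "y", "x" and "both".
repel_direction <- "y"

p <- ggplot(milwaukee, aes(x = Age_Group, y = Total_Count, fill = Race)) +
  geom_col(position = "stack") +
  ggrepel::geom_text_repel(
    aes(label = scales::number(Total_Count, big.mark = ",", accuracy = 1)),
    size = 2,
    position = position_stack(vjust = 0.5),  # start from the segment centres
    direction = repel_direction,
    box.padding = unit(0.01, "lines")
    # for the second variant, also pass segment.color = NA to hide connectors
  ) +
  labs(x = "Age Group", y = "Total Vaccinated by Age Group",
       title = "Milwaukee Total Vaccinated by Age Group & Race") +
  theme_classic() +
  theme(text = element_text(size = 9, family = "mono"),
        legend.position = "bottom",
        plot.title = element_text(hjust = 0.5, size = 11))
```

Printing `p` draws the chart. The label positions are computed at draw time, so the result can change with the size of the plotting device.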

Jon Spring
  • direction = 'y' worked, but I had no luck with 'x'. Not sure what I did wrong. I'm probably going to stick with 'y' or edit the visuals manually in an SVG editor. Thank you – user3813620 Aug 24 '21 at 03:02

Given the data, there is probably no ideal solution to this problem. Too many groups are just too small to be shown in the same bar with labels within/on top of each other.

In general, {ggfittext} does exactly what you are looking for, yet it cannot perform miracles:

ggplot(data = milwaukee, aes(x = Age_Group, y = Total_Count, fill = Race)) +
  geom_bar(stat = 'identity', position = 'stack') +
  labs(x = 'Age Group', y = 'Total Vaccinated by Age Group',
       title = 'Milwaukee Total Vaccinated by Age Group & Race') + 
  ggfittext::geom_bar_text(position = "stack", reflow = TRUE, outside = TRUE) +
  theme_classic() + 
  theme(text = element_text(size = 9, family = 'mono'), 
        legend.position = 'bottom',
        plot.title = element_text(hjust = 0.5, size = 11))

I'd suggest either combining some groups, using a relative presentation, or adjusting the missing labels outside of ggplot.

[Image: resulting plot with ggfittext::geom_bar_text()]
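The first two suggestions (combining groups and a relative presentation) could look roughly like the sketch below. The 5% cutoff for folding small groups into "Other" is an arbitrary assumption, as are the helper column names `Race_Lumped` and `Share`; the data are rebuilt compactly from the question.

```r
library(ggplot2)

# Sample data rebuilt compactly from the question's dput() output
# (Town and Count_Type are constant there and are omitted here).
races <- c("White", "Black", "AmerIndian", "Asian", "Hispanic",
           "MultipleRaces", "Other", "Unknown")
ages <- c("12-17", "18-24", "25-44", "45-64", "65+")
milwaukee <- data.frame(
  Age_Group = rep(ages, each = 8),
  Race = rep(races, times = 5),
  Total_Count = c(500, 1080, 11, 172, 2404, 105, 135, 272,
                  1012, 1610, 10, 326, 3051, 110, 502, 480,
                  3281, 4185, 34, 738, 10023, 147, 2060, 1907,
                  4453, 6361, 41, 695, 9250, 144, 2549, 2043,
                  4000, 3520, 22, 368, 3554, 83, 1182, 1354),
  stringsAsFactors = FALSE
)

# Assumption: races with less than 5% of the overall total go into "Other".
race_totals <- tapply(milwaukee$Total_Count, milwaukee$Race, sum)
small <- names(race_totals)[race_totals / sum(race_totals) < 0.05]
milwaukee$Race_Lumped <- ifelse(milwaukee$Race %in% small,
                                "Other", milwaukee$Race)

# Sum the counts per age group and lumped race, then convert each count
# to its share of the age-group total.
agg <- aggregate(Total_Count ~ Age_Group + Race_Lumped,
                 data = milwaukee, FUN = sum)
agg$Share <- agg$Total_Count / ave(agg$Total_Count, agg$Age_Group, FUN = sum)

p <- ggplot(agg, aes(x = Age_Group, y = Share, fill = Race_Lumped)) +
  geom_col(position = "stack") +
  geom_text(aes(label = scales::percent(Share, accuracy = 1)), size = 2,
            position = position_stack(vjust = 0.5)) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Age Group", y = "Share of Vaccinated within Age Group",
       fill = "Race") +
  theme_classic() +
  theme(legend.position = "bottom")
```

With this cutoff, AmerIndian, Asian and MultipleRaces are merged into "Other", which leaves five segments per bar. Printing `p` draws the chart.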

mnist
  • I found a potential solution [here](https://stackoverflow.com/questions/24626769/alternate-geom-text-position-with-hjust). However, I'm having difficulty adapting it to my dataset. Specifically, I don't understand what is fed into `y` inside the second call to the `aes()` function. Any help would be highly appreciated – user3813620 Aug 29 '21 at 18:08