I have a dataset with groups nested within other groups, like the one below:

    Year    Status     Group  N
    1992    Pre        Blue   7
    1993    Pre        Blue   2
    1995    Pre        Blue   11
    2002    Pre        Blue   10
    2003    Pre        Blue   8
    2006    Post       Green  7
    2007    Post       Green  14
    2008    Post       Green  13
    2009    Post       Green  9
    2010    Post       Green  7
    2011    Pre        Blue   3
    2011    Post       Green  2
    2012    Pre        Blue   2
    2012    Post       Green  4
    2013    Pre        Blue   5
    2013    Post       Green  2
    2014    Pre        Blue   4
    2014    Post       Green  10
    2015    Current    Blue   12
    2015    Post       Green  8
    2016    Current    Blue   3
    2016    Post       Green  5
    2017    Current    Blue   13
    2017    Post       Green  6
    2018    Pre        Blue   2
    2018    Post       Green  7
    2019    Pre        Blue   9
    2019    Post       Green  7
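
For anyone who wants to reproduce the plots, the table above can be read into R roughly as follows. This is a minimal sketch: the name `df` is taken from the plotting code in this thread, and reading from an inline string with `read.table` is just one convenient option.

```r
# Rebuild the sample table as a data frame called `df`
# (the name used by the plotting code in this thread).
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Year Status  Group N
1992 Pre     Blue  7
1993 Pre     Blue  2
1995 Pre     Blue  11
2002 Pre     Blue  10
2003 Pre     Blue  8
2006 Post    Green 7
2007 Post    Green 14
2008 Post    Green 13
2009 Post    Green 9
2010 Post    Green 7
2011 Pre     Blue  3
2011 Post    Green 2
2012 Pre     Blue  2
2012 Post    Green 4
2013 Pre     Blue  5
2013 Post    Green 2
2014 Pre     Blue  4
2014 Post    Green 10
2015 Current Blue  12
2015 Post    Green 8
2016 Current Blue  3
2016 Post    Green 5
2017 Current Blue  13
2017 Post    Green 6
2018 Pre     Blue  2
2018 Post    Green 7
2019 Pre     Blue  9
2019 Post    Green 7
")

# Quick check of column types and row count
str(df)
```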

What I am trying to do is plot the count (column N) on the y axis and Year on the x axis, by Group.

  • histogram from group Blue to be colored blue,
  • histogram from group Green to be colored
    • grey when Status = Pre,
    • green when Status = Current,
    • yellow when Status = Post,
  • add a label showing the count on top of each bar of the histogram.

Something like the example below [please note that the colors in this example do not match the additional requirements mentioned above].

I am able to plot this using geom_bar, but I am not sure how to plot these counts by group as a histogram. Any help is much appreciated.

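library(ggplot2)

ggplot(df, aes(Year, N)) +
  # one bar per Year/Group combination, drawn side by side
  geom_bar(aes(fill = Group), position = "dodge", stat = "identity", width = .5) +
  # count label above each bar (the argument is vjust, not vadjust)
  geom_text(aes(label = N, group = Group), position = position_dodge(width = 0.5), vjust = -0.5) +
  theme_bw(base_size = 18) +
  ylab('Bedbugs') + xlab('Year')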

bison2178

2 Answers

Using your data (let's call it df) and your color definitions, only blue and yellow appear in the plot, but presumably you have more data than this. You can approach it something like this:

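library(dplyr)
library(ggplot2)

# Derive one color key per row from Group and Status
df <- df %>%
  mutate(color = case_when(
    Group == "Blue" ~ "Blue",
    Group == "Green" & Status == "Pre" ~ "Green/Pre",
    Group == "Green" & Status == "Current" ~ "Green/Current",
    Group == "Green" & Status == "Post" ~ "Yellow"))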

colors = c("Blue"="blue","Green/Pre" = "grey","Green/Current" = "green","Yellow" = "yellow")

ggplot(df, aes(Year, N, color = color, fill = color)) +
  geom_bar(position = "dodge", stat = "identity", width = .5) +
  # black count labels, dodged by Group so they sit above the matching bar
  geom_text(aes(label = N, group = Group), position = position_dodge(width = 0.5), vjust = -0.5, color = "black") +
  scale_color_manual(values = colors) +
  scale_fill_manual(values = colors) +
  theme_bw(base_size = 18) +
  ylab('Count') + xlab('Year')
langtang
  • But this is not geom_histogram; it still looks like my old geom_bar. – bison2178 Feb 18 '22 at 23:12
  • I would have solved the problem the same way. Not sure why you got downvoted. The code runs, and it seems to hit all the requirements. It does not help that OP's example data isn't easy to import into R, and the example data doesn't even have all the combinations (e.g. Current+Green). Also, here is the general difference between geom_bar and geom_histogram: https://stackoverflow.com/questions/14138247/ggplot-geom-bar-vs-geom-histogram – OTA Feb 18 '22 at 23:15
  • @OTA, the bars are not equal width. In years where there is only one group rather than two, the bar looks fat. During the years 2011-2011, the bar width looks equal. – bison2178 Feb 18 '22 at 23:20
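
One possible way to address the unequal bar widths raised in the comments is to give every year a row for every group, so the dodge always reserves two slots. The sketch below assumes tidyr is available (it is not used elsewhere in this thread) and uses a small illustrative subset of the question's data. The names `df_full` and `p` are made up for the example.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Illustrative subset of the question's data: 2009 and 2010 have only Green.
df <- data.frame(
  Year   = c(2009, 2010, 2011, 2011, 2012, 2012),
  Status = c("Post", "Post", "Pre", "Post", "Pre", "Post"),
  Group  = c("Green", "Green", "Blue", "Green", "Blue", "Green"),
  N      = c(9, 7, 3, 2, 2, 4)
)

# Add the missing Year/Group combinations; their Status and N become NA.
df_full <- df %>% complete(Year, Group)

# Every year now has two dodge slots, so all bars get the same width.
# na.rm = TRUE silently drops the placeholder rows when drawing.
p <- ggplot(df_full, aes(Year, N, fill = Group)) +
  geom_col(position = position_dodge(width = 0.5), width = 0.5, na.rm = TRUE) +
  geom_text(aes(label = N, group = Group),
            position = position_dodge(width = 0.5), vjust = -0.5, na.rm = TRUE) +
  theme_bw(base_size = 18)
```

Printing `p` draws the plot. The same `complete()` step should work ahead of the `case_when()` color mapping above, since the placeholder rows have no Status and simply get an NA color key.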

I think the main problem is that the histograms are overlaid. With position = "identity" and bins = 10 you get this:

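library(dplyr)    # provides the %>% pipe
library(ggplot2)

# Histogram of the N values themselves, one semi-transparent layer per Status
df %>% 
  ggplot(aes(x = N, fill = Status)) +
  geom_histogram(position = "identity", alpha = 0.3, bins = 10) +
  theme_bw()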
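
If the aim is instead a geom_histogram with Year on the x axis, one option (an assumption about what the OP wants, not something confirmed in the thread) is to pass the pre-counted N as the `weight` aesthetic, so each one-year bin sums N rather than counting rows. A minimal sketch on a small illustrative subset of the data, where `p` and `binned` are names made up for the example:

```r
library(ggplot2)

# Illustrative subset of the question's data (same column names).
df <- data.frame(
  Year  = c(2011, 2011, 2012, 2012),
  Group = c("Blue", "Green", "Blue", "Green"),
  N     = c(3, 2, 2, 4)
)

# binwidth = 1 puts each year in its own bin; weight = N makes the
# bar height the sum of N in that bin instead of the number of rows.
p <- ggplot(df, aes(x = Year, weight = N, fill = Group)) +
  geom_histogram(binwidth = 1, position = "dodge") +
  theme_bw()

# Inspect the binned values that ggplot2 computed for the layer.
binned <- layer_data(p)
```

With one row per Year and Group this gives the same bar heights as geom_bar(stat = "identity"), which is why the two plots look alike for data that is already counted.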

TarJae