I want to colour the background of a histogram produced using ggplot2. I want the background to look like the one in the answer here.

This is my code:

dates <- seq(from = as.Date("2015/1/1"), to = as.Date("2015/12/31"), "day")

library(lubridate)
day <- yday(dates)
month <- month(dates)

df <- data.frame(day, month)

library(dplyr)
df %>%
sample_n(50) ->
df

library(ggplot2)
ggplot(df, aes(day)) + geom_histogram() + 
    scale_x_continuous(breaks = seq(0, 365, 10), limits = c(0, 365)) + 
    theme_bw()

Which produces this plot:

[image: the resulting histogram, without background shading]

And this is what I tried, which doesn't work:

ggplot(df, aes(day)) + geom_histogram() + 
    geom_rect(xmin = day, xmax = day, ymin = -Inf, ymax = Inf, fill = month) + 
    scale_x_continuous(breaks = seq(0, 365, 10), limits = c(0, 365)) + 
    theme_bw()
luciano
  • The answer is pretty clear what you need to do. What part of it is difficult to understand? – Roman Luštrik Jul 20 '15 at 09:09
  • Have you looked at the other answers to your question? http://stackoverflow.com/questions/31510796/shade-background-of-ggplot-according-to-month/31511456#31511456 – Heroka Jul 20 '15 at 09:15
  • @RomanLuštrik if you look at `rects` in the linked answer, the `xstart` and `xend` arguments of `geom_rect` each have their own variables. But in `df`, that I have created, I only have a single variable, `day`, which is converted to bins by `geom_histogram`, so it's unclear what the values should be for `xmin` and `xmax`. – luciano Jul 20 '15 at 09:29
  • Coloring of the background is done in a second layer that may or may not be connected to the original dataset. What you're telling ggplot is that you want to plot something from day to day on the x axis and from -Inf to Inf on the y axis. If you think about it, a region from day to day has width 0. The values passed to xmin and xmax must produce a positive difference. – Roman Luštrik Jul 20 '15 at 10:28

1 Answer

You are trying to draw the rectangles from the sampled data, which won't work because data is missing. To draw the rectangles, you need to specify the start and end day of each month, and this is best achieved by creating an extra data set for this purpose.

I create this data frame as follows:

library(dplyr)
month_df <- df %>%
            group_by(month) %>%
            summarize(start=min(day),end=max(day) + 1) %>%
            mutate(month=as.factor(month))
# correct the last day of the year
month_df[12,"end"] <- month_df[12,"end"] - 1

It's important that you do this before you replace df with the 50 samples, because the sample will generally not contain the first and last day of every month. The last line is somewhat unpleasant: to avoid gaps between the rectangles, I add one to the last day of each month, but this should not be done for the very last day of the year. It works, but maybe you can find a neater solution...
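One possibly neater way to get the same boundaries is to compute them from the calendar instead of from df. This is only a sketch in base R: the names month_starts, boundaries and month_df2 are made up for illustration, and it assumes the year is 2015 and that the x axis stops at 365, as in the question.

```r
# First day of each month in 2015, plus 1 Jan 2016 as the closing boundary
month_starts <- seq(as.Date("2015-01-01"), as.Date("2016-01-01"), by = "month")

# Convert each boundary to a day-of-year value relative to 2015
# (1 Jan 2015 becomes 1, 1 Feb 2015 becomes 32, 1 Jan 2016 becomes 366)
boundaries <- as.numeric(month_starts - as.Date("2015-01-01")) + 1

# Each month ends where the next one starts, so there are no gaps.
# The last end is capped at 365 so that it stays inside the axis limits.
month_df2 <- data.frame(
  month = factor(1:12),
  start = boundaries[1:12],
  end   = pmin(boundaries[2:13], 365)
)
```

Because this does not depend on df at all, it no longer matters whether it runs before or after the sampling step.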

The first few lines of month_df should be

   month start end
1      1     1  32
2      2    32  60
3      3    60  91

Now, the plot can be created by

ggplot(df) + 
  geom_rect(data=month_df,aes(xmin = start, xmax = end, fill = month),
            ymin = -Inf, ymax = Inf) + 
  geom_histogram(aes(day)) + 
  scale_x_continuous(breaks = seq(0, 365, 10), limits = c(0, 365)) + 
  theme_bw()

A few remarks:

* It is important that geom_rect() comes before geom_histogram() so that the rectangles end up in the background.
* I moved aes(day) from ggplot() into geom_histogram() because it is used only there. Otherwise, it will confuse geom_rect() and you will get an error.
* ymin = -Inf and ymax = Inf are not aesthetic mappings from the data, because they are set to constants. So there is no need to have them inside aes(). Nothing bad will happen if you keep them inside aes(), though.
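If you would rather keep aes(day) in ggplot(), an alternative to moving it is to tell the rectangle layer not to inherit the global mapping, using the inherit.aes argument of geom_rect(). The following is a self-contained sketch, not the code used for the plot in this answer: the df here is stand-in data, and the month boundaries are typed in by hand for 2015.

```r
library(ggplot2)

# Stand-in data so the sketch runs on its own (assumption: 50 random days)
set.seed(1)
df <- data.frame(day = sample(1:365, 50))

# Month boundaries for 2015 as day-of-year values, entered by hand
month_df <- data.frame(
  month = factor(1:12),
  start = c(1, 32, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335),
  end   = c(32, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 365)
)

p <- ggplot(df, aes(day)) +
  # inherit.aes = FALSE stops this layer from picking up x = day
  geom_rect(data = month_df,
            aes(xmin = start, xmax = end, fill = month),
            ymin = -Inf, ymax = Inf, inherit.aes = FALSE) +
  # the histogram still inherits x = day from ggplot()
  geom_histogram() +
  scale_x_continuous(breaks = seq(0, 365, 10), limits = c(0, 365)) +
  theme_bw()
```

Printing p draws the plot. The layer order is unchanged, so the rectangles still sit behind the histogram.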

The plot I get is the following:

[image: the histogram with the background shaded by month]

Stibu