I've written a function that produces a histogram with two vertical lines marking a range of values. I'd like to modify this function so that the bars within the specified range are a different color.

Here's my function and a quick demonstration:

library(ggplot2)
niceHist <- function(data, cutpoint1, cutpoint2, title = "Supply a title, genius") {
  temp_dat = data.frame(Data = data, Col = 0)
  temp_dat = temp_dat[! is.na(temp_dat$Data),]
  # Flag observations inside the range (computed, but not yet used for coloring)
  temp_dat$Col = as.numeric(temp_dat$Data >= cutpoint1 & temp_dat$Data <= cutpoint2)
  # Build the plot from temp_dat so only one histogram layer is drawn
  my_hist = ggplot(temp_dat, aes(x = Data)) +
    geom_histogram(fill = "forestgreen") +
    geom_vline(xintercept = cutpoint1) +
    geom_vline(xintercept = cutpoint2) +
    ggtitle(title) +
    theme_minimal() +
    theme(text = element_text(size = 16), axis.line.y = element_line(color = "black", size = 0.5), axis.line.x = element_line(color = "black", size = 0.5))
  my_hist
}
# Demonstration: mark one standard deviation either side of the mean
u = rnorm(100)
c1 = mean(u) - sd(u)
c2 = mean(u) + sd(u)
niceHist(u, c1, c2)

I've seen a similar question whose accepted solution is not well-suited to my needs because I want to maintain the shape of the original histogram. I'd also prefer not to change the number of bins and, if at all possible, apply the color difference such that a single bar in the histogram can be two colors if the vertical black lines happen to bisect it.

*My main goal is to clearly display how much of the distribution is captured within the supplied range without changing the shape of the distribution.* So an alternative but less desirable solution would be to simply color the background with respect to the range provided. Also, I need my function to return a ggplot object because it will occasionally be further modified using ggplot syntax.

UPDATE: At a commenter's suggestion, I tried using scale_fill_gradientn, but this doesn't work:

niceHist <- function(data, cutpoint1, cutpoint2, title = "Supply a title, genius") {
  temp_dat = data.frame(Data = data, Col = 0)
  temp_dat = temp_dat[! is.na(temp_dat$Data),]
  temp_dat[temp_dat$Data >= cutpoint1 & temp_dat$Data <= cutpoint2,]$Col = 1
  my_hist = qplot(data) +
    geom_histogram() +
    scale_fill_gradientn(colours = c("blue", "red", "red", "blue"), values = c(min(data, na.rm = TRUE), cutpoint1, cutpoint2, max(data, na.rm = TRUE))) +
    geom_vline(xintercept = cutpoint1) +
    geom_vline(xintercept = cutpoint2) +
    ggtitle(paste(title)) +
    theme_minimal() +
    theme(text = element_text(size = 16), axis.line.y = element_line(color = "black", size = 0.5), axis.line.x = element_line(color = "black", size = 0.5))
  my_hist
}
Slavatron

1 Answer

My solution is to rebuild the data frame of bin counts from the plot and add a factor indicating which region each bin falls in. For example:

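library(dplyr)
# Recover the per-bin counts from the histogram layer of the existing plot
df <- ggplot_build(niceHist(u, c1, c2))$data[[1]]
# Label each bin by the region its center (x) falls in: below c1, between c1 and c2, or above c2
df <- df %>% mutate(col = cut(x, c(min(x) - 0.001, c1, c2, max(x) + 0.001)))
ggplot(df, aes(x, count)) + geom_col(aes(fill = col)) + geom_vline(xintercept = c(c1, c2))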
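
The geom_col approach colors each bar as a whole, by the position of its center. If a bar that straddles a cutpoint should be two colors, one possible variation is to draw the bins as rectangles and cut them at the cutpoints. The following is only a sketch under stated assumptions: `splitHist` is a hypothetical helper that is not part of the question or of ggplot2, and it bins the data itself into 30 equal-width bins spanning the data range, so its bar edges may not line up exactly with those of `geom_histogram`.

```r
library(ggplot2)

# Hypothetical helper (an illustration, not the asker's function): draws each
# bin as one or more rectangles, cutting bins at the cutpoints so that a single
# bar can be two colors while keeping its full height.
splitHist <- function(data, cutpoint1, cutpoint2, bins = 30) {
  data <- data[!is.na(data)]
  # Assumption: equal-width bin edges from min(data) to max(data)
  edges <- seq(min(data), max(data), length.out = bins + 1)
  counts <- as.vector(table(cut(data, edges, include.lowest = TRUE)))
  # Piece boundaries: every bin edge plus any cutpoint inside the data range
  cuts <- c(cutpoint1, cutpoint2)
  cuts <- cuts[cuts > min(edges) & cuts < max(edges)]
  bounds <- sort(unique(c(edges, cuts)))
  pieces <- data.frame(xmin = head(bounds, -1), xmax = tail(bounds, -1))
  mid <- (pieces$xmin + pieces$xmax) / 2
  # Each piece keeps the count of the bin that contains it
  pieces$count <- counts[findInterval(mid, edges, all.inside = TRUE)]
  pieces$in_range <- mid >= cutpoint1 & mid <= cutpoint2
  ggplot(pieces) +
    geom_rect(aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = count, fill = in_range)) +
    geom_vline(xintercept = c(cutpoint1, cutpoint2)) +
    scale_fill_manual(values = c("FALSE" = "grey60", "TRUE" = "forestgreen"), name = "In range") +
    labs(x = "data", y = "count") +
    theme_minimal()
}
```

Since the function returns an ordinary ggplot object, it can still be extended with ggplot syntax, for example `splitHist(u, c1, c2) + ggtitle("Within one SD of the mean")`.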
Zhang Cheng