p <- ggplot() +
  geom_histogram(data = df1, aes(x = meanf, fill = "g", color = "g"), alpha = 0.6, binwidth = 0.02) +
  geom_histogram(data = df2, aes(x = meanf, fill = "b", color = "b"), alpha = 0.4, binwidth = 0.02) +
  scale_colour_manual(name = "N1", values = c("g" = "green", "b" = "blue"), labels = c("b" = "1", "g" = "2")) +
  scale_fill_manual(name = "N2", values = c("g" = "green", "b" = "blue"), labels = c("b" = "1", "g" = "2")) +
  theme_bw()

# ggsave() is a function call, not a layer, so it is called on the finished plot
ggsave("temp.jpg", plot = p)

I am getting the plot with histogram counts, but I want the y-axis scaled between 0 and 1. From the question "Normalizing y-axis in histograms in R ggplot to proportion", I understand how to do this for one data frame, but how do I do it when the two histograms come from two data frames, as in my code above?


maximusdooku

2 Answers


Scale the data first! Create a temporary dataset whose y-values are divided by a fixed number, so that they fall between 0 and 1. If you want the tallest bar to be exactly 1.00, use the largest count as the divisor.
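One possible reading of this suggestion, as a minimal sketch: bin both samples on the same edges, then divide every count by the largest count. The example data, the `breaks` construction, and the names `binned`, `mid` and `height` are assumptions made for illustration; the bin edges chosen here need not coincide with the ones `geom_histogram` would pick.

```r
# Hypothetical data standing in for df1 and df2 from the question
set.seed(1)
df1 <- data.frame(meanf = rnorm(500, mean = 0.5, sd = 0.1))
df2 <- data.frame(meanf = rnorm(200, mean = 0.6, sd = 0.1))

# Shared bin edges of width 0.02, padded so every observation falls in a bin
binwidth <- 0.02
rng <- range(c(df1$meanf, df2$meanf))
breaks <- seq(rng[1] - binwidth, rng[2] + 2 * binwidth, by = binwidth)

# Count per bin without drawing anything
h1 <- hist(df1$meanf, breaks = breaks, plot = FALSE)
h2 <- hist(df2$meanf, breaks = breaks, plot = FALSE)

# One common divisor: the tallest bar across both histograms
divisor <- max(h1$counts, h2$counts)

# Temporary dataset with heights scaled to the interval [0, 1]
binned <- rbind(
  data.frame(group = "g", mid = h1$mids, height = h1$counts / divisor),
  data.frame(group = "b", mid = h2$mids, height = h2$counts / divisor)
)
```

The `binned` data frame could then be drawn with a bar geom such as `geom_col(aes(x = mid, y = height, fill = group), position = "identity")` in place of `geom_histogram`.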

CinchBlue
  • No need for a temporary dataset, you can provide an equation in the plot’s aesthetics. – Konrad Rudolph Jul 27 '15 at 20:54
  • @maximusdooku Apologies, I hadn’t seen the restriction of using two datasets - `..count.. / max(..count..)` only seems to work for a single one, which makes sense. – Konrad Rudolph Jul 27 '15 at 21:15
geom_histogram(data = df1, aes(y = ..ncount..,x=meanf,fill = "g", color="g"))

should do it.
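For context, `ncount` is the bin count rescaled so that the tallest bar of a layer is 1. Because each `geom_histogram` layer computes its statistic separately, both histograms peak at 1 regardless of sample size. A minimal sketch with made-up data follows; it assumes ggplot2 3.3.0 or later, where `after_stat(ncount)` is the newer spelling of `..ncount..`, and uses `layer_data()` only to inspect the computed values.

```r
library(ggplot2)

# Hypothetical data standing in for df1 and df2 from the question
set.seed(1)
df1 <- data.frame(meanf = rnorm(500, mean = 0.5, sd = 0.1))
df2 <- data.frame(meanf = rnorm(200, mean = 0.6, sd = 0.1))

p <- ggplot() +
  geom_histogram(data = df1, aes(x = meanf, y = after_stat(ncount), fill = "g"),
                 alpha = 0.6, binwidth = 0.02) +
  geom_histogram(data = df2, aes(x = meanf, y = after_stat(ncount), fill = "b"),
                 alpha = 0.4, binwidth = 0.02) +
  scale_fill_manual(name = "N2", values = c(g = "green", b = "blue"),
                    labels = c(b = "1", g = "2"))

# Each layer is normalised on its own, so both layer maxima equal 1
layer_max <- c(max(layer_data(p, 1)$ncount), max(layer_data(p, 2)$ncount))
```

Note that this puts the two histograms on separate scales: a bar of height 0.5 in one layer does not represent the same count as a bar of height 0.5 in the other.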

If you want both histograms to be normalized by the same divisor:

First, get the y-range of the original histogram.

ggobj <- ggplot() + 
  geom_histogram(data = df1, aes(x=meanf,fill = "g", color="g"), alpha = 0.6,binwidth = 0.02)+
  geom_histogram(data = df2, aes(x=meanf,fill = "b", color="b"), alpha = 0.4,binwidth = 0.02)

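# Upper limit of the y-axis of the built plot. This accessor matches the ggplot2
# release current in 2015; later releases keep the panel ranges under $layout instead.
y_max <- ggplot_build(ggobj)$panel$ranges[[1]]$y.range[2] 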

Then recreate your histogram, dividing the counts by the y_max that you got.

p <- ggplot() + 
      geom_histogram(data = df1, aes(y_max=y_max, y=..count../y_max,x=meanf,fill = "g", color="g"), alpha = 0.6,binwidth = 0.02)+
      geom_histogram(data = df2, aes(y_max=y_max, y=..count../y_max,x=meanf,fill = "b", color="b"), alpha = 0.4,binwidth = 0.02)
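A variant of the same idea, offered as a sketch rather than a replacement: the upper limit of `y.range` typically includes the axis expansion, so the tallest bar can end up slightly below 1. Reading the largest count from the computed layer data avoids that. The example data is made up, and the code assumes ggplot2 3.3.0 or later for `after_stat()`; there, `y_max` is looked up in the calling environment, so the extra `y_max = y_max` aesthetic should not be needed.

```r
library(ggplot2)

# Hypothetical data standing in for df1 and df2 from the question
set.seed(1)
df1 <- data.frame(meanf = rnorm(500, mean = 0.5, sd = 0.1))
df2 <- data.frame(meanf = rnorm(200, mean = 0.6, sd = 0.1))

# Build the plain count histograms once, only to read off the bar heights
counts <- ggplot() +
  geom_histogram(data = df1, aes(x = meanf), binwidth = 0.02) +
  geom_histogram(data = df2, aes(x = meanf), binwidth = 0.02)

# Tallest bar over both layers, taken from the computed layer data
y_max <- max(layer_data(counts, 1)$count, layer_data(counts, 2)$count)

# Recreate the plot with every count divided by the common maximum
p <- ggplot() +
  geom_histogram(data = df1,
                 aes(x = meanf, y = after_stat(count / y_max), fill = "g", colour = "g"),
                 alpha = 0.6, binwidth = 0.02) +
  geom_histogram(data = df2,
                 aes(x = meanf, y = after_stat(count / y_max), fill = "b", colour = "b"),
                 alpha = 0.4, binwidth = 0.02)
```

With a shared divisor, bar heights stay comparable between the two histograms: the taller histogram peaks at exactly 1 and the other keeps its relative size.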
Onur
  • When I divide by y_range, I am getting this error even though y_range exists: Error in eval(expr, envir, enclos) : object 'y_range' not found – maximusdooku Jul 27 '15 at 22:34
  • Alright, you can't use an environment variable with the special ggplot variables like `..count..`, so you need to redefine it in aes. I fixed the code, it should work now. – Onur Jul 28 '15 at 13:32