My question is similar to "Normalizing y-axis in histograms in R ggplot to proportion", but I'd like to add to it a bit.

In general, I have 6 histograms in a 2x3 facet design, and I'd like to normalize each of them separately. I'll try to make a sample data set here to give an idea:

hvalues=c(3,1,3,2,2,5,1,1,12,1,4,3)
season=c("fall","fall","fall","fall","winter","winter","winter","winter","summer","summer","summer","summer")
year=c("year 1","year 1","year 2","year 2","year 1","year 1","year 2","year 2","year 1","year 1","year 2","year 2")
group=c("fall year 1","fall year 1","fall year 2","fall year 2","winter year 1","winter year 1","winter year 2","winter year 2","summer year 1","summer year 1","summer year 2","summer year 2")
all=data.frame(hvalues,season,year,group)

Using

ggplot(all, aes(x=hvalues,group=group)) + 
geom_histogram(aes(y=..count../sum(..count..))) + 
facet_grid(season ~ year)

gives the proportions overall (i.e. combining all the facets). I'd like each facet to be normalized separately, so that the bars within it sum to 1. Note that hvalues are not integers in my actual data; they are continuous numeric values.

I am a novice using R, and would really appreciate some help. Thanks in advance!

user1195564
  • Try `y = ..density..`. – joran May 02 '13 at 13:55
  • `all` has to be a dataframe. Try `all <- as.data.frame(cbind(hvalues,season,year))`. – Jonas Tundo May 02 '13 at 13:57
  • @JT85 I agree, but please don't encourage the use of `as.data.frame(cbind(...))` in place of `data.frame(...)`. – joran May 02 '13 at 14:03
  • @joran I have played around with y=..density.. , but I don't think this is really what I want to convey in the figure. I am interested in proportions of home ranges falling into different size categories, within each season, and within each year. Also, running ..density.. gives y values up to 3.0, which I don't understand. – user1195564 May 02 '13 at 14:09
  • @JT85 I have edited the question to include your data.frame comment...I was quickly trying to come up with a dataset and automatically went to cbind! – user1195564 May 02 '13 at 14:11
  • The usual last resort is always to calculate the proportions yourself outside of ggplot. – joran May 02 '13 at 14:12
  • And in any case, if you're plotting proportions within distinct categories, `geom_bar` with pre-computed proportions would probably be more appropriate anyway. – joran May 02 '13 at 14:19
  • @joran could you give an example of using geom_bar? – user1195564 May 02 '13 at 14:23
  • @joran - really sorry for the confusion! I should have specified that hvalues was numerical – user1195564 May 02 '13 at 15:35
  • Check out `?stat_bin` and try the options there. I think maybe `..ncount..` is what you're looking for. – joran May 02 '13 at 15:36
  • @joran this is almost it. but ..ncount.. seems to scale to one in each bin of the histogram – user1195564 May 02 '13 at 15:42
  • Sigh. Exactly. Scaling to 1 in each facet is precisely what you said you want. I'm moving on now. – joran May 02 '13 at 15:44
  • Yes, I wanted the proportions to add to 1 within each facet. ..ncount.. is showing proportions that add to >1 in each facet. Maybe I'm missing something here. Sorry to have bothered you, thank you for your help. – user1195564 May 02 '13 at 15:58
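
The suggestion in the comments to calculate the proportions outside ggplot and plot them with `geom_bar` was never spelled out, so here is a minimal sketch of one way it might look. The bin breaks, the data frame name `dat`, and the extra `ncount` and `density` columns are assumptions added for illustration; they are not from the original thread.

```r
# Sample data from the question (named dat here to avoid masking base::all)
hvalues <- c(3, 1, 3, 2, 2, 5, 1, 1, 12, 1, 4, 3)
season  <- rep(c("fall", "winter", "summer"), each = 4)
year    <- rep(rep(c("year 1", "year 2"), each = 2), times = 3)
dat     <- data.frame(hvalues, season, year)

# Bin the values; these break points are an arbitrary choice for the sketch
breaks  <- seq(0, 12, by = 2)
dat$bin <- cut(dat$hvalues, breaks = breaks)

# Count observations per bin within every season/year combination
counts <- as.data.frame(table(bin = dat$bin, season = dat$season, year = dat$year))

# Proportion: each count divided by the total of its own facet (sums to 1 per facet)
counts$prop <- counts$Freq / ave(counts$Freq, counts$season, counts$year, FUN = sum)

# For comparison: count rescaled so the tallest bar in a facet is 1,
# and proportion divided by bin width
counts$ncount  <- counts$Freq / ave(counts$Freq, counts$season, counts$year, FUN = max)
counts$density <- counts$prop / diff(breaks)[1]

# Plot the pre-computed proportions, if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(counts, aes(x = bin, y = prop)) +
    geom_bar(stat = "identity") +
    facet_grid(season ~ year)
  # print(p) draws the figure
}
```

The two comparison columns mirror what `stat_bin` computes: `..ncount..` is the count rescaled to a maximum of 1, so its bars usually sum to more than 1, and `..density..` is the proportion divided by the bin width, so it integrates to 1 and can exceed 1 whenever the bins are narrower than 1.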

1 Answer

The solution is:

ggplot(all, aes(x = hvalues)) +
    facet_grid(season ~ year, drop = TRUE) +
    geom_histogram(aes(y = ..count.. / tapply(..count.., ..PANEL.., sum)[..PANEL..]))

I stole this from this question

I feel your question might be a duplicate of that one by the way.
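
To see what the expression inside `aes()` is doing, the same arithmetic can be reproduced in base R. The `count` and `panel` vectors below are made-up stand-ins for the per-bin values that ggplot computes internally; they are assumptions for illustration, not output from the plot above.

```r
# Hypothetical per-bin counts: three bins in each of two panels
count <- c(2, 1, 1, 0, 3, 3)
panel <- factor(c(1, 1, 1, 2, 2, 2))

# One total per panel (4 for panel 1, 6 for panel 2)
totals <- tapply(count, panel, sum)

# Indexing by the panel factor expands the totals back to one value per bin
per_bin_total <- totals[panel]

# Each bin's count divided by the total of its own panel
prop <- count / per_bin_total

# Proportions now sum to 1 within each panel
tapply(prop, panel, sum)
```

Dividing by `sum(..count..)` alone, as in the question, would instead use the grand total across all panels, which is why those proportions summed to 1 over the whole figure rather than per facet.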

stacksia
  • Does anyone know what docs describe `..PANEL..`? I see count etc listed in `?stat_bin` under "Computed Variables" section, but I don't know how to even start searching for help on PANEL. Also, this is neat. – helmingstay Nov 08 '22 at 06:51