I'm trying to plot Female and Male data for each year in a facet wrap plot. As an example, for the year 2013 there are 10,949 data points for female and 53,351 data points for male. Here's a sample of the data:

 cost gender year
1 305.665 Female 2013
2 194.380 Female 2013
3 462.490 Female 2013
4 200.430 Female 2013
5 188.570 Female 2013
6 277.245 Female 2013

The code I put together is:

library(ggplot2)
costs<-read.table("cost_data.txt",header=TRUE)
df<-data.frame(costs)
ggplot(df, aes(df$cost,color=df$gender)) + 
geom_histogram(breaks=seq(0,3000,by=20),alpha=0.2) + facet_wrap(~year)+
labs(x="Costs",y="Number of Members")

Which produces the following chart:

[Image: faceted histogram of costs by gender, one panel per year]

Now if I just plotted the 2013 histograms in Excel with a bin width of 20, the female plot would peak at 300 counts and the male would peak at 1800 counts. So what I've plotted in the chart doesn't make sense to me. It shows the female higher than the male, and I'm not sure why the legend keys (or the histogram bars) aren't filled in.

Just need a little guidance.

Angus
  • It'd be helpful if you could post some reproducible code, e.g. run `dput(df)` and post the output. It looks like the female values are being stacked on top of the male values, which is the default in `ggplot2`. Try `geom_histogram(breaks=seq(0,3000,by=20),alpha=0.2, position = "dodge")` – TheSciGuy Apr 25 '19 at 14:43
  • Possible duplicate of [multiple histograms with ggplot2 - position](https://stackoverflow.com/questions/8901330/multiple-histograms-with-ggplot2-position) – TheSciGuy Apr 25 '19 at 14:46
  • Thanks. That did work, but the legend still shows no fill. – Angus Apr 25 '19 at 14:49
  • Use fill `ggplot(df, aes(df$cost,fill=df$gender))` – TheSciGuy Apr 25 '19 at 14:52
  • Also remove the reference of `df$` inside `aes()`. – Parfait Apr 25 '19 at 14:53
  • @NickDylla using `$` inside an `aes` call is bad advice. See the [r-faq question on it](https://stackoverflow.com/questions/32543340/issue-when-passing-variable-with-dollar-sign-notation-to-aes-in-combinatio) – camille Apr 25 '19 at 14:57
  • You don't have a legend for the fill because you haven't mapped anything to fill, you've mapped to color – camille Apr 25 '19 at 14:57
  • Nice catch @Camille, had it correct in my answer, just slipped past me ;) – TheSciGuy Apr 25 '19 at 14:58

1 Answer

For those who don't read the comments...

# To show bars side-by-side
geom_histogram(breaks=seq(0,3000,by=20),alpha=0.2, position = "dodge")

# To have filled bars and legend keys
ggplot(df, aes(cost,fill=gender))

# Putting it all together
library(ggplot2)
costs <- read.table("cost_data.txt", header = TRUE)
df <- data.frame(costs)
ggplot(df, aes(cost, fill = gender)) +
  geom_histogram(breaks = seq(0, 3000, by = 20), alpha = 0.2, position = "dodge") +
  facet_wrap(~year) +
  labs(x = "Costs", y = "Number of Members")
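
Since `cost_data.txt` isn't available to readers, here is a minimal sketch with simulated data for checking what the `position` argument does to the bar heights. The data frame `sim`, its sample sizes, and the gamma distribution are made up for illustration and are not the asker's data. It also uses `position = "identity"`, an alternative to `"dodge"` that overlays the two histograms so each starts at zero.

```r
library(ggplot2)

# Simulated stand-in for cost_data.txt (hypothetical values, not the real data)
set.seed(42)
n_female <- 1000
n_male   <- 5000
sim <- data.frame(
  cost   = pmin(rgamma(n_female + n_male, shape = 2, scale = 150), 2999),
  gender = rep(c("Female", "Male"), times = c(n_female, n_male)),
  year   = 2013
)

breaks <- seq(0, 3000, by = 20)

# Default position = "stack": one group's bars start where the other's end,
# so the total bar height is the combined count, not the per-gender count
p_stack <- ggplot(sim, aes(cost, fill = gender)) +
  geom_histogram(breaks = breaks) +
  facet_wrap(~year)

# position = "identity": both histograms start at zero and overlap,
# so each bar height is that gender's own count
p_identity <- ggplot(sim, aes(cost, fill = gender)) +
  geom_histogram(breaks = breaks, alpha = 0.4, position = "identity") +
  facet_wrap(~year) +
  labs(x = "Costs", y = "Number of Members")

# Pull out the bars ggplot2 computed and compare the peak count per group
# (group 1 is Female, group 2 is Male, following alphabetical order)
bars    <- ggplot_build(p_identity)$data[[1]]
peaks   <- tapply(bars$count, bars$group, max)
stacked <- ggplot_build(p_stack)$data[[1]]
```

Running `print(p_identity)` draws the overlaid version, and `peaks` gives the tallest bin per gender, which can be compared against counts from another tool such as Excel.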
TheSciGuy