I have a data frame with the columns age and gender (Male/Female). I want to draw a bar plot grouped by age and overlay a line plot of the male-to-female ratio for each age.

test is the data frame, with age and gender as columns.

ratio_df is a new data frame that stores the male-to-female ratio for each age:

ratio_df <- ddply(test, 'age', function(x) c('ratio' = sum(test$gender == 'Male') / sum(test$gender == 'Female'))) 

The ggplot call with the bar plot and the ratio line:

ggplot(data = test, aes(x = factor(age), fill = gender)) + geom_bar() + geom_line(data = ratio_df, aes(x = age, y = ratio))
  • Please give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – ziggystar Jan 23 '14 at 09:11
  • Your ddply call seems off to me - I think it always yields the same ratio (over the whole dataframe). – CMichael Jan 23 '14 at 12:14
  • I want a bar plot from the test data frame with a line plot from the ratio_df data frame overlaid on it. – Shishir Bhattarai Jan 23 '14 at 14:28
  • When you create a stacked barplot, where `gender` is represented by different `fill` colours, the ratio of male to female is inherently visible for each stacked bar. – Sven Hohenstein Jan 23 '14 at 14:38

1 Answer

As mentioned above, your ddply call seems off to me: the function reads test$gender instead of the per-age subset x, so I think it always yields the same ratio (computed over the whole data frame). I could not come up with a compact, elegant alternative off the top of my head, so I resorted to a somewhat clunky one, but it does work.
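A minimal sketch of why the ratio comes out constant, assuming a toy data frame with "m"/"f" gender codes. Base R's split() and sapply() stand in for ddply here purely for illustration (they are not what the question used), but the pattern is the same: the function receives one age's subset as x and has to use x rather than the full data frame.

```r
# Toy data (an assumption for illustration): gender coded as "m"/"f"
test <- data.frame(gender = c("m", "m", "f", "m", "f", "f", "f"),
                   age    = c(1, 3, 4, 4, 3, 4, 4))

# Pattern from the question: the function ignores its argument x and
# reads the whole data frame, so every age gets the overall ratio
whole <- sapply(split(test, test$age),
                function(x) sum(test$gender == "m") / sum(test$gender == "f"))

# Per-group pattern: use x, the subset of rows for a single age
per_age <- sapply(split(test, test$age),
                  function(x) sum(x$gender == "m") / sum(x$gender == "f"))

print(whole)
print(per_age)
```

In the ddply call from the question, the corresponding change would be to replace test$gender with x$gender inside the function. Note that a plain male-to-female ratio is infinite for any age with no females, which is one reason to plot a share instead.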

EDIT: I changed the code to use the workaround described at http://rwiki.sciviews.org/doku.php?id=tips:graphics-ggplot2:aligntwoplots to address the OP's comment.

#sample data
test=data.frame(gender=c("m","m","f","m","f","f","f"),age=c(1,3,4,4,3,4,4))

require(plyr)
age_N <- ddply(test, c("age","gender"), summarise, N=length(gender))

require(reshape2)
ratio_df <- dcast(age_N, age ~ gender, value.var="N", fill=0)
#share of males per age (not m/f, which would be infinite where f is 0)
ratio_df$ratio <- ratio_df$m / (ratio_df$f+ratio_df$m)

#create variables for facetting
test$panel = rep("Distribution",length(test$gender))
ratio_df$panel = rep("Ratio",length(ratio_df$ratio))

test$panel <- factor(test$panel,levels=c("Ratio","Distribution"))

require(ggplot2)
g <- ggplot(data = test, aes(x = factor(age)))
g <- g + facet_wrap(~panel,scales="free",ncol=1)
g <- g + geom_line(data = ratio_df, aes(x = factor(age), y = ratio, group=1))
g <- g + geom_bar(aes(fill=gender))
print(g)

Is this what you are looking for? However, I think @SvenHohenstein is right that the line does not add any information, as the split is evident from the fill.

CMichael