
I have some aggregate data that I would like to visualize using ggplot2, and I am looking for a smart shortcut for plotting it.

The data I have looks like this (I currently only have data like df2):

df <- data.frame(cyl = mtcars[,2])
(df2 <- t(table(df$cyl)))
      4  6  8
[1,] 11  7 14

and I would like to create a plot like this (see below for the code used to create it):

[Figure: geom_bar on aggregate data]

require(ggplot2)
df <- data.frame(cyl = mtcars[,2])
df$cyl2  <- ifelse(df$cyl > 4, c('Treatment'), c('Control'))
# hline.data <- data.frame(cyl2 = c('Treatment', 'Control'), cyl = c(10, 2))

p <- ggplot(df, aes(factor(cyl)))
p + geom_bar(fill="white", colour="darkgreen", alpha=0.5) +
  facet_grid(. ~ cyl2, scales = "free", space = "free") +
  theme(axis.title.x  = element_blank(), axis.title.y  = element_blank(),
        axis.ticks.x = element_blank(), legend.position = "none") +
  scale_y_continuous(breaks=c(4, 6, 8), labels=c("Minimal","Mild", "Moderate")) +
  scale_x_discrete(breaks=c(6,8,4), labels=c("Treat 1", "Treat 2", "Control"))

Bonus question for people who make it this far: is it possible to make the breaks= in scale_y_continuous different in the two facets? Or would it be a better strategy to add a geom_hline and put some text in the margins? Adding text in the margins seems quite tricky; I would have to override the clipping like this, or this.
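As a rough illustration of the geom_hline idea mentioned above, here is a minimal sketch. It assumes that a small data frame with one row per facet (hline.data, adapted from the commented-out line in the code above) is passed to geom_hline; the column name yint and the values 10 and 2 are placeholders, not meaningful thresholds. Margin text is not covered here.

```r
df <- data.frame(cyl = mtcars[, 2])
df$cyl2 <- ifelse(df$cyl > 4, "Treatment", "Control")

# One row per facet; the yint values are illustrative placeholders
hline.data <- data.frame(cyl2 = c("Treatment", "Control"), yint = c(10, 2))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(factor(cyl))) +
    geom_bar(fill = "white", colour = "darkgreen", alpha = 0.5) +
    # hline.data carries the facet variable cyl2, so each line is drawn
    # only in the facet whose cyl2 value matches its row
    geom_hline(data = hline.data, aes(yintercept = yint), linetype = "dashed") +
    facet_grid(. ~ cyl2, scales = "free", space = "free")
  print(p)
}
```

The y axis is still shared between the two facets in this layout; only the reference lines differ.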

Eric Fail
  • I'm not sure I understand what your main question is: what more did you want do with the plot you've got? – Marius Jan 30 '13 at 05:17
  • @Marius, thank you for pointing out the obscurity. My main question is whether there is a way to plot based on aggregate data. I only have the aggregate data shown in `df2`. – Eric Fail Jan 30 '13 at 05:18

1 Answer


You basically just need to get your aggregate data into a dataframe, then add in the group labels:

df <- data.frame(cyl = mtcars[,2])
(df2 <- t(table(df$cyl)))
# Assume you know the group labels
df2 <- data.frame(val=df2[1, ], label=c("Control", "Treat1", "Treat2"))
df2$cyl2 <- c("Control", "Treatment", "Treatment")

ggplot(df2, aes(label, val)) +
  # stat="identity" plots the precomputed counts instead of counting rows
  geom_bar(fill="white", colour="darkgreen", alpha=0.5, stat="identity") +
  facet_grid(. ~ cyl2, scales = "free", space = "free") +
  theme(axis.title.x  = element_blank(), axis.title.y  = element_blank(),
        axis.ticks.x = element_blank(), legend.position = "none") +
  scale_y_continuous(breaks=c(4, 6, 8), labels=c("Minimal","Mild", "Moderate")) +
  # breaks must match the values in df2$label, otherwise no tick labels are drawn
  scale_x_discrete(breaks=c("Treat1", "Treat2", "Control"),
                   labels=c("Treat 1", "Treat 2", "Control"))
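If the goal is to keep plotting code that expects one row per observation, another option is to expand the counts back into rows. The sketch below assumes the aggregate data has the shape of df2 from the question (a 1 x 3 table whose column names are the cylinder values); df_long and cyl_values are names introduced here for illustration.

```r
# Aggregate counts as in the question: a 1 x 3 table, cylinder value -> count
df2 <- t(table(mtcars$cyl))

# Column names of a table are character, so convert them back to numbers
cyl_values <- as.numeric(colnames(df2))

# Repeat each cylinder value as many times as its count
df_long <- data.frame(cyl = rep(cyl_values, times = as.vector(df2)))

# Same grouping rule as in the question
df_long$cyl2 <- ifelse(df_long$cyl > 4, "Treatment", "Control")

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # geom_bar can count rows again, so stat="identity" is not needed here
  p <- ggplot(df_long, aes(factor(cyl))) +
    geom_bar(fill = "white", colour = "darkgreen", alpha = 0.5) +
    facet_grid(. ~ cyl2, scales = "free", space = "free")
  print(p)
}
```

This trades memory for convenience: it is fine for small counts like these, but with very large counts the stat="identity" approach avoids building the expanded data frame.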
Marius
  • Touché. The only detail is that this solution cannot reuse the code presented above. I am not saying I can't rewrite it to work from this data, but my question was whether there is a _smart shortcut_ to 'unfold' the aggregate data. – Eric Fail Jan 30 '13 at 05:35
  • It's pretty rare that you can completely change the data without tweaking the plotting code a bit. Add a `cyl2` column to `df2`, e.g. `df2$cyl2 <- c("Control", "Treatment", "Treatment")`, and you should be able to keep using your `facet_grid()` call and most of the other elements of the plot. – Marius Jan 30 '13 at 05:45
  • @EricFail: I've redone it using your original plotting code now, the only things I needed to do were add in that `cyl2` variable and add `stat="identity"` to the `geom_bar` call. – Marius Jan 30 '13 at 05:58
  • Thanks. I appreciate that you took the time to complete the details in your answer. – Eric Fail Jan 30 '13 at 06:00