I have 3 datasets df1, df2, df3, each containing three columns (csv files: https://www.dropbox.com/s/56qh1l5kchsiof0/datasets.zip?dl=0)

Each dataset is plotted as a stacked bar graph of its three columns, like so:

[Figure 1: This example shows df3, where the three columns of the dataset df3.csv are stacked one on top of the other]

Here's my R code to produce the above plot:

require(reshape2)
library(ggplot2)
library(RColorBrewer)

df = read.csv(".../df3.csv",sep=",", header=TRUE)

df.m = melt(df,c("density"))

c = ggplot(df.m, aes(x = density, y = value/1e+06,fill = variable)) + labs(x = "Density", y = "Cumulated ranks",fill = NULL)
c = c + geom_bar(stat = "identity", position = "stack") + scale_fill_grey(start = 0.2, end = 0.8, na.value = "grey50")

c = c + ggtitle('Relative valuation of 75-node resilient networks\naccording to their density')  + theme(plot.title = element_text(lineheight=.8, face="bold"))

c

I now need to build a facet plot where df1, df2, and df3 (each showing its three columns, stacked) share the same x-axis scale, like this:

[Figure 2: I'm sorry for the terrible doodle... Also, each subplot should be a stacked bar graph, as in figure 1, not a density plot]

Can I just do something like this:

require(reshape2)
library(ggplot2)
library(RColorBrewer)

df = read.csv(".../df1.csv",sep=",", header=TRUE)
df.m = melt(df,c("density"))
a = ggplot(df.m, aes(x = density, y = value/1e+06,fill = variable)) + labs(x = "Density", y = "Cumulated ranks",fill = NULL)
a = a + geom_bar(stat = "identity", position = "stack") + scale_fill_grey(start = 0.2, end = 0.8, na.value = "grey50")
a = a + ggtitle('subtitle 1')  + theme(plot.title = element_text(lineheight=.8, face="bold"))

df = read.csv(".../df2.csv",sep=",", header=TRUE)
df.m = melt(df,c("density"))
b = ggplot(df.m, aes(x = density, y = value/1e+06,fill = variable)) + labs(x = "Density", y = "Cumulated ranks",fill = NULL)
b = b + geom_bar(stat = "identity", position = "stack") + scale_fill_grey(start = 0.2, end = 0.8, na.value = "grey50")
b = b + ggtitle('subtitle 2')  + theme(plot.title = element_text(lineheight=.8, face="bold"))

df = read.csv(".../df3.csv",sep=",", header=TRUE)
df.m = melt(df,c("density"))
c = ggplot(df.m, aes(x = density, y = value/1e+06,fill = variable)) + labs(x = "Density", y = "Cumulated ranks",fill = NULL)
c = c + geom_bar(stat = "identity", position = "stack") + scale_fill_grey(start = 0.2, end = 0.8, na.value = "grey50")
c = c + ggtitle('subtitle 3')  + theme(plot.title = element_text(lineheight=.8, face="bold"))

all = facet_grid( _???_ )

Or need I organize my data differently?

Lucien S.
  • In the future, it would be better not to rely on others downloading potentially dangerous zip files. See [how to make a reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for tips on how to create self-contained data you can add to your question. – MrFlick Sep 13 '14 at 19:40

1 Answer

It would be easier if you reorganized your data. You want all your data in one data.frame so that you call ggplot() only once. To do this, you need to stack your data.frames and add a column indicating which file each row came from. When I need to read in a bunch of files, I use a custom helper function called read.stack() (it is not part of base R), but there are hundreds of different ways you can prepare your data.
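For readers who do not have the read.stack() helper, here is one possible base-R sketch of the same idea: read each file, tag its rows with the file name, and row-bind the pieces. The function name `read_and_stack`, the temporary folder, the column names `colB` and `colC`, and all the numbers are made up for illustration; only `density` and `modu` appear in the actual data shown in this answer.

```r
# Hypothetical stand-in for read.stack(): read each CSV, add a `file` column
# holding the source file name, then row-bind everything. Base R only.
read_and_stack <- function(files, ...) {
  pieces <- lapply(files, function(f) {
    d <- read.csv(f, ...)
    d$file <- basename(f)   # remember where each row came from
    d
  })
  do.call(rbind, pieces)
}

# Toy demonstration: write three small CSVs with invented values
tmp <- file.path(tempdir(), "stack_demo")
dir.create(tmp, showWarnings = FALSE)
for (i in 1:3) {
  toy <- data.frame(density = c(0.12, 0.24),
                    modu = c(10, 20) * i,
                    colB = c(1, 2) * i,   # hypothetical column name
                    colC = c(3, 4) * i)   # hypothetical column name
  write.csv(toy, file.path(tmp, paste0("df", i, ".csv")), row.names = FALSE)
}

ff <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
dd <- read_and_stack(ff)
str(dd)
```

The stacked `dd` can then be melted with `melt(dd, c("density", "file"))`, so that `file` survives as an id column to facet on.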

Here's what I tried. First, we prepare the data:

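# list the csv files with their full paths
ff <- list.files("~/Downloads/datasets/", full.names = TRUE)
# read and stack them, adding a "file" column that records the source file
dd <- read.stack(ff, sep = ",", header = TRUE, extra = list(file = basename(ff)))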
mm<-melt(dd,c("density","file"))
head(mm)

#   density    file variable value
# 1    0.12 df1.csv     modu    50
# 2    0.12 df1.csv     modu   472
# 3    0.12 df1.csv     modu   145
# 4    0.12 df1.csv     modu    59
# 5    0.12 df1.csv     modu    51
# 6    0.12 df1.csv     modu    86

Notice how we just added a column indicating the source of the data which we will later use to specify a facet. Now we plot...

ggplot(mm, aes(x=density, y=value/1e6, fill=variable)) + 
    geom_bar(stat="identity", position="stack") + 
    scale_fill_grey(start = 0.2, end = 0.8, na.value = "grey50") +
    labs(x = "Density", y = "Cumulated ranks",fill = NULL) + 
    ggtitle('Relative valuation of 75-node resilient networks\naccording to their density') + 
    theme(plot.title = element_text(lineheight=.8, face="bold")) + 
    facet_grid( file~.)

And the result is

[Image: the faceted plot, with one stacked bar panel per source file, all sharing the same x axis]
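As a side note on the faceting call, here is a small sketch with invented data. With `facet_grid(file ~ .)` the panels are stacked in rows and share the x axis; by default they also share the y scale. If the three datasets have very different totals, `scales = "free_y"` is one option to give each row its own y axis. The data frame `mm_toy`, the column names `colB` and `colC`, and the values are assumptions made only so the example is self-contained.

```r
library(ggplot2)

# Toy long-format data (made-up values), shaped like the melted `mm`
mm_toy <- expand.grid(
  density  = c(0.12, 0.24, 0.36),
  variable = c("modu", "colB", "colC"),   # colB/colC are hypothetical names
  file     = c("df1.csv", "df2.csv", "df3.csv"),
  stringsAsFactors = FALSE
)
mm_toy$value <- seq_len(nrow(mm_toy)) * 1e5

p <- ggplot(mm_toy, aes(x = density, y = value / 1e6, fill = variable)) +
  geom_bar(stat = "identity", position = "stack") +
  scale_fill_grey(start = 0.2, end = 0.8) +
  # one row per file; x axis shared, y axis allowed to differ per row
  facet_grid(file ~ ., scales = "free_y")

print(p)
```

Dropping `scales = "free_y"` gives the default fixed scales used in the plot above, which makes bar heights directly comparable across panels.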

MrFlick
  • Oh, this looks so great, thanks a lot. One extra question if I may: how to, in the extra column added, put custom text instead of the file name. More precisely, I'd need the text "n = 25" instead of df1.csv, "n = 50" instead of df2.csv, and "n = 75" instead of df3.csv? – Lucien S. Sep 13 '14 at 20:27
  • The facets use the levels of the factor (i.e. `levels(mm$file)`). You may change those levels to whatever you like (e.g. `levels(mm$file) <- c("n=25","n=50","n=75")`). Just make sure you replace them in the proper order. – MrFlick Sep 13 '14 at 20:44
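
A minimal sketch of the relabelling described in the last comment, using a toy `mm` with invented values. It assumes the `file` column may be a character vector rather than a factor (the default for `read.csv` in recent R versions), in which case assigning to `levels()` alone would not relabel the values, so the column is converted with `factor()` first. Pairing `levels` with `labels` explicitly also avoids getting the order wrong.

```r
# Toy stand-in for the melted data; only the `file` column matters here
mm <- data.frame(
  file  = c("df1.csv", "df2.csv", "df3.csv", "df1.csv"),
  value = 1:4,
  stringsAsFactors = FALSE
)

# Map each file name to its facet label; levels and labels are matched by position
mm$file <- factor(mm$file,
                  levels = c("df1.csv", "df2.csv", "df3.csv"),
                  labels = c("n = 25", "n = 50", "n = 75"))

levels(mm$file)
```

After this step, `facet_grid(file ~ .)` would use the new labels as the panel strip text.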