
I have multiple CSV files with the same data structure:

url,A,B,C,D
a.com,1,2,3,4
b.com,3,4,5,6

I can create a stacked bar plot with urls on x-axis and A,B,C,D stacked on top of each other.

Now I want to create clustered stacked bar plots, with multiple such csv files, all indexed by the same url on the x axis.

library(reshape2)
library(ggplot2)

data1 = read.csv("data.csv")
data2 = read.csv("data2.csv")
# Reshape to long format: one row per (url, variable) pair
data.m = melt(data1, id.var="url")

ggplot(data.m, aes(x = url, y = value, fill = variable)) + 
  geom_bar(position="fill",stat = "identity")

Basically, I want to add data2 to the plot. I am not sure whether I should use gather, use facets, or manually create new columns after melting.

It should look something like this: enter image description here

Ayush Goel

1 Answer


Is this what you're after?

# Two sample datasets
df1 <- cbind.data.frame(
    url = c("a.com", "b.com"),
    A = c(1, 3), B = c(2, 4), C = c(3, 5), D = c(4, 6));

df2 <- cbind.data.frame(
    url = c("a.com", "b.com"),
    A = c(5, 7), B = c(6, 8), C = c(7, 9), D = c(8, 10));

Using gather

library(tidyr)
df <- rbind.data.frame(
    gather(cbind.data.frame(df1, src = "df1"), variable, value, -url, -src),
    gather(cbind.data.frame(df2, src = "df2"), variable, value, -url, -src))

Using melt

library(reshape2)
df <- rbind.data.frame(
    melt(cbind.data.frame(df1, src = "df1"), id.vars = c("url", "src")),
    melt(cbind.data.frame(df2, src = "df2"), id.vars = c("url", "src")))

Sample plot

library(ggplot2)
ggplot(df, aes(x = url, y = value, fill = variable)) + geom_bar(stat = "identity") + facet_wrap(~ src)

enter image description here

Note: If you have multiple CSV files, it is best to read them into a list with df.list <- lapply(..., read.csv) and then melt df.list, which gives the columns variable, value and L1 (L1 corresponds to src).
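The note above could look roughly like this. It is only a sketch: the two files are written to a temporary directory so the example runs on its own, the file names are borrowed from the question, and renaming L1 to src is my own choice rather than anything melt does for you.

```r
library(reshape2)

# Write two small sample files so the sketch is self-contained
# (in practice these would be your existing CSV files)
tmp <- tempdir()
write.csv(data.frame(url = c("a.com", "b.com"),
                     A = c(1, 3), B = c(2, 4), C = c(3, 5), D = c(4, 6)),
          file.path(tmp, "data.csv"), row.names = FALSE)
write.csv(data.frame(url = c("a.com", "b.com"),
                     A = c(5, 7), B = c(6, 8), C = c(7, 9), D = c(8, 10)),
          file.path(tmp, "data2.csv"), row.names = FALSE)

# Read every file into one list, named by file so the source is traceable
files <- file.path(tmp, c("data.csv", "data2.csv"))
df.list <- lapply(files, read.csv)
names(df.list) <- basename(files)

# Melting a list of data frames adds an L1 column holding the list names
df <- melt(df.list, id.vars = "url")

# Rename L1 to src to match the plots in this answer
names(df)[names(df) == "L1"] <- "src"
```

The resulting df has the same url, variable, value and src columns as the one built by hand above, so the plotting code works on it unchanged.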


Update

I'm not entirely clear on what you are after, so this is a bit of a stab in the dark. You can also cluster by url (instead of src):

ggplot(df, aes(x = src, y = value, fill = variable)) + geom_bar(stat = "identity") + facet_wrap(~ url);

enter image description here

and/or show bars side-by-side (instead of stacked):

ggplot(df, aes(x = src, y = value, fill = variable)) + geom_bar(stat = "identity", position = "dodge") + facet_wrap(~ url);

enter image description here
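If the aim is for the two a.com bars to sit next to each other as one cluster, the faceted plot can be tightened so the panels read like groups on a single axis. Here is a rough sketch, assuming the same sample data as above; moving the strips below the axis and removing the panel spacing are my own styling choices, not something from the question.

```r
library(reshape2)
library(ggplot2)

# Same sample data as at the top of this answer
df1 <- data.frame(url = c("a.com", "b.com"),
                  A = c(1, 3), B = c(2, 4), C = c(3, 5), D = c(4, 6))
df2 <- data.frame(url = c("a.com", "b.com"),
                  A = c(5, 7), B = c(6, 8), C = c(7, 9), D = c(8, 10))

# Long format with a src column recording which dataset each row came from
df <- rbind(
  melt(cbind(df1, src = "df1"), id.vars = c("url", "src")),
  melt(cbind(df2, src = "df2"), id.vars = c("url", "src")))

# One stacked bar per src, grouped by url; the facet strips are moved
# below the x-axis so each url labels its cluster of bars
p <- ggplot(df, aes(x = src, y = value, fill = variable)) +
  geom_bar(stat = "identity") +
  facet_grid(~ url, switch = "x") +
  theme(strip.placement = "outside",
        strip.background = element_blank(),
        panel.spacing = unit(0, "lines"))
print(p)
```

To get proportions as in the question, position = "fill" can be passed to geom_bar.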

Maurits Evers
  • Nope. df2 has the same values for the x-axis, i.e. a.com and b.com. I want these two stacked bar plots to be clustered. – Ayush Goel Nov 02 '17 at 23:17
  • See my updated answer. You don't mention that you have the same `url` entries in all datasets, so people are left guessing... – Maurits Evers Nov 02 '17 at 23:23
  • I'll edit my question to make it more clear, but since I mentioned clustered barplots, I want both the a.com plots next to each other. Something like this https://stackoverflow.com/questions/18774632/how-to-produce-stacked-bars-within-grouped-barchart-in-r – Ayush Goel Nov 03 '17 at 00:22
  • Provided I understand you correctly, you can simply `facet_wrap` by `url` and plot `src` on the x-axis. See my updated answer. – Maurits Evers Nov 03 '17 at 02:08
  • Thanks for updating your answer. I thought it was clear what I wanted when I wrote clustered stacked bar plots. Anyway, I've updated my question with a picture of exactly what I want. – Ayush Goel Nov 03 '17 at 16:23
  • @AyushGoel How is the second figure I show different from your picture? Simply facet wrap by `url` and show different stacked bars for every `csv` file. You can fine-tune the plot to add extra axis labels etc. I don't see how my solution does not answer your question. – Maurits Evers Nov 04 '17 at 02:46