I was wondering how to add more advanced levels of color scheming with ggplot2 without having to set it up manually.

Here's a picture of what I have so far: [image: current graph]

I would like to make it such that all the data follows a gradient, but each Actinobacteria has different shadings of a color, same with Firmicutes, and so on. I can assign a variable to my data frame that represents the taxonomical group that it is in, but how would I use this or some other method to make the color changes that I am looking for?

Here's how the data (prior to melting) looks: [image: data that gets plotted]

The sample column is irrelevant, and so is nseqs. The group column is used to assign each bar to the proper facet of the grid, in this case based on race. The bacteria types are also columns. Each row contains the proportion (as a percentage) of each bacteria type, so the bacteria-type values in each row add up to 100. The xorder column is used to order the bars within their facet, and is predetermined.
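To make the layout described above concrete, here is a minimal sketch of the wide format and its long (melted) equivalent. The frame `fdata_toy` and its values are invented for illustration and are not the real data; only the column roles follow the description.

```r
# Toy wide frame with invented values (NOT the real data); two taxa only.
fdata_toy <- data.frame(
  sample = c("s1", "s2", "s3"),
  group  = c("af", "as", "af"),
  nseqs  = c(1, 1, 2),
  xorder = c(1, 2, 3),
  `Firmicutes (other)` = c(60, 25, 10),
  Proteobacteria       = c(40, 75, 90),
  check.names = FALSE
)

id_cols  <- c("group", "nseqs", "sample", "xorder")
tax_cols <- setdiff(names(fdata_toy), id_cols)

# Each row's taxon percentages should add up to 100.
row_totals <- rowSums(fdata_toy[tax_cols])

# Base-R stand-in for melt(): one row per sample/taxon pair.
long_toy <- cbind(
  fdata_toy[rep(seq_len(nrow(fdata_toy)), times = length(tax_cols)), id_cols],
  variable = factor(rep(tax_cols, each = nrow(fdata_toy)), levels = tax_cols),
  value    = unlist(fdata_toy[tax_cols], use.names = FALSE)
)
rownames(long_toy) <- NULL
```

The long frame has the same `variable`/`value` columns that the plotting code maps to `fill` and `y`.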

Again, I want the bars to be colored not only by fill = variable but also by the taxonomic group each variable belongs to: all the Actinobacteria in similar shades of one color, all the Firmicutes in shades of another, and so on.
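One possible way to get this kind of grouped shading, sketched here under assumptions rather than offered as a definitive solution: derive the taxonomic group from the text before the first " (" in each label, give each group one base color, and ramp that color toward white across the group's members. The helper `shade_palette` and the colors in `base_cols` are hypothetical choices made for illustration.

```r
# Variable labels as they appear in the melted data.
vars <- c("Actinobacteria (Propionibacterium)", "Actinobacteria (other)",
          "Actinobacteria (Corynebacterium)", "Actinobacteria (Micrococcaceae)",
          "Firmicutes (Streptococcus)", "Firmicutes (other)",
          "Firmicutes (Staphylococcus)",
          "Firmicutes (Clostridiales_Incertae_Sedis_XI)",
          "Proteobacteria", "other")

# Assumed convention: the taxonomic group is the text before the first " (".
taxon <- sub(" \\(.*$", "", vars)

# Assumed base colours; any named vector covering unique(taxon) would do.
base_cols <- c(Actinobacteria = "#08519C", Firmicutes = "#A50F15",
               Proteobacteria = "#006D2C", other = "#525252")

# Hypothetical helper: one shade per variable, grouped by taxon.
shade_palette <- function(vars, taxon, base_cols) {
  pal <- character(length(vars))
  for (tx in unique(taxon)) {
    idx <- which(taxon == tx)
    # Ramp from the base colour toward white; the white endpoint is not used.
    ramp <- colorRampPalette(c(base_cols[[tx]], "white"))(length(idx) + 1)
    pal[idx] <- ramp[seq_along(idx)]
  }
  setNames(pal, vars)
}

fill_values <- shade_palette(vars, taxon, base_cols)
```

Because `fill_values` is named by the variable labels, adding `scale_fill_manual(values = fill_values)` to the ggplot2 call should map each bar segment to its shade without listing the colors by hand.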

This is the current segment of code that deals with plotting:

library(reshape2)  # provides melt()
library(ggplot2)

fdata$nseqs <- factor(fdata$nseqs)
fdata$xorder <- factor(fdata$xorder)
ggfdata <- melt(fdata, id.vars = c('group', 'nseqs', 'sample', 'xorder'))
# meta.frame and i are defined elsewhere in the script (not shown)
p <- ggplot(ggfdata, aes(x = xorder, y = value, fill = variable)) +
  geom_bar(stat = 'identity') +
  facet_grid(~group, scales = 'free_x', space = 'free_x') +
  scale_y_continuous() +
  labs(title = paste('Taxonomic Distribution - grouped by', colnames(meta.frame)[i])) +
  ylab('Percentage') +
  xlab('(sorted within group by increasing number of sequences)') +
  theme(axis.ticks.x = element_blank(), axis.text.x = element_blank()) +
  scale_x_discrete() +
  scale_color_brewer()  # sets the colour scale, not fill, so the bar fills are unaffected
print(p)


SIMPLIFIED RAW DATASET (ONLY 3 SAMPLES, I.E. BARS) [image: the data in an Excel table]

Here is the dput() of the melted raw simplified data:

samples <- structure(list(group = c("af", "as", "af", "af", "as", "af", 
"af", "as", "af", "af", "as", "af", "af", "as", "af", "af", "as", 
"af", "af", "as", "af", "af", "as", "af", "af", "as", "af", "af", 
"as", "af"), nseqs = structure(c(1L, 1L, 2L, 1L, 1L, 2L, 1L, 
1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 
2L, 1L, 1L, 2L, 1L, 1L, 2L), class = "factor", .Label = c("1", 
"2")), sample = c("Abidjan.534R", "Tanger.534R", "Salvador.534R", 
"Abidjan.534R", "Tanger.534R", "Salvador.534R", "Abidjan.534R", 
"Tanger.534R", "Salvador.534R", "Abidjan.534R", "Tanger.534R", 
"Salvador.534R", "Abidjan.534R", "Tanger.534R", "Salvador.534R", 
"Abidjan.534R", "Tanger.534R", "Salvador.534R", "Abidjan.534R", 
"Tanger.534R", "Salvador.534R", "Abidjan.534R", "Tanger.534R", 
"Salvador.534R", "Abidjan.534R", "Tanger.534R", "Salvador.534R", 
"Abidjan.534R", "Tanger.534R", "Salvador.534R"), xorder = structure(c(1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), class = "factor", .Label = c("1", 
"2", "3")), variable = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 
3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 
8L, 9L, 9L, 9L, 10L, 10L, 10L), .Label = c("Actinobacteria (Propionibacterium)", 
"Actinobacteria (other)", "Actinobacteria (Corynebacterium)", 
"Actinobacteria (Micrococcaceae)", "Firmicutes (Streptococcus)", 
"Firmicutes (other)", "Firmicutes (Staphylococcus)", "Firmicutes (Clostridiales_Incertae_Sedis_XI)", 
"Proteobacteria", "other"), class = "factor"), value = c(28.4151076597247, 
14.3313054752169, 68.823080833757, 25.5794799388163, 19.8663608257704, 
6.17691916624301, 28.9210495352394, 51.4411090056847, 6.72343670564311, 
0.176491351923756, 0.1196768724444, 4.28317234367056, 1.00011766090128, 
0.3889498354443, 2.99949161159126, 1.69431697846806, 2.4434028124065, 
4.30859176410778, 2.05906577244382, 1.1169841428144, 0.699034062023386, 
9.80115307683257, 7.28034307370101, 0.59735638027453, 1.6825508883398, 
2.3237259399621, 3.34265378749364, 0.670667137310272, 0.688142016555301, 
2.04626334519573)), .Names = c("group", "nseqs", "sample", "xorder", 
"variable", "value"), row.names = c(NA, -30L), class = "data.frame")
alexwhan
Nik
  • Please provide a reproducible example http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example, at the very least some data to work with, and the code you've used to produce your plot – alexwhan Aug 05 '13 at 12:28
  • There. Hopefully it's easier to work with. The entire code, which involves processing and whatnot, is a pain to go through, so I included the significant parts that deal with graphing, and an idea of the final data frame that gets plotted. – Nik Aug 05 '13 at 13:11
  • An image is quite an impossible way to work with the data! Try posting the output of `dput(ggfdata)` or, better yet, make a simplified example dataset. – alexwhan Aug 05 '13 at 13:20
  • I simplified the dataset to only 3 samples, and I included dput of its melted data, along with a pic of an excel document containing the dataframe in case it helps to get an idea of what the data looks like. Does this work, or should I include anything else? – Nik Aug 05 '13 at 13:49

0 Answers