
I'm looking at the behavior of different groups of people (called clusters in this data set) and their browser preferences. I want to create a bar graph that shows the percentage of each cluster using each type of browser.

Here is some code to generate a similar dataset (please ignore that the percentages for each cluster will not add up to 1):

browserNames <- c("microsoft","mozilla","google")
clusterNames <- c("Cluster 1","Cluster 2","Cluster 3")
percentages <- runif(n=length(browserNames)*length(clusterNames),min=0,max=1)

myData<-as.data.frame(list(browserNames=rep(browserNames,3),
                           clusterNames=rep(clusterNames,each=3),
                           percentages=percentages))

Here's the code I've been able to come up with so far to get the graph I desire:

library(ggplot2)
library(scales)   # provides percent() for the axis labels
ggplot(myData, aes(x=browserNames, y=percentages, fill=factor(clusterNames))) +
    geom_bar(stat="identity",position="dodge") +
    scale_y_continuous(name="Percent Weight", labels=percent)

I want the fill for each cluster to be a gradient fill with high and low values that I determine. So, in this example, I would like to be able to set 3 high and low values for each cluster that is represented.

I've had trouble with the different scale_fill commands, and I'm new enough to ggplot that I'm probably just using them incorrectly. Any ideas?

Edit: Here is a picture of what I'm looking for:


(Original image available at https://www.dropbox.com/s/py6hifejqz7k54v/gradientExample.bmp)

Brian Diggs
jLangford
  • Just so we're clear, you'd want the Cluster1 bars to be different shades of red (for example) based on how tall they are, Cluster2 bars to be different shades of green based on how tall they are, etc.? – joran Jan 14 '14 at 20:27
  • You're exactly right, Joran. – jLangford Jan 14 '14 at 21:21
  • I'm giving you +1 because the question is well formulated (after the edit); the discussion with jlhoward's answer explains why it is not a good idea (and thus not directly possible in `ggplot2`) – Brian Diggs Jan 14 '14 at 23:32

1 Answer


Is this close to what you had in mind?

# color set depends on browser
library(RColorBrewer)     # for brewer.pal(...)
gg        <- with(myData, myData[order(browserNames,percentages),])
gg$colors <- 1:9
colors    <- c(brewer.pal(3,"Reds"),brewer.pal(3,"Greens"),brewer.pal(3,"Blues"))

ggplot(gg, aes(x=browserNames, y=percentages, 
               fill=factor(colors), group=factor(clusterNames))) +
  geom_bar(stat="identity",position="dodge", color="grey70") + 
  scale_fill_manual("Browser", values=colors, 
                    breaks=c(3,6,9), labels=c("Google","Microsoft","Mozilla"))

# color set depends on cluster
library(RColorBrewer)     # for brewer.pal(...)
gg        <- with(myData, myData[order(clusterNames,percentages),])
gg$colors <- 1:9
col    <- c(brewer.pal(3,"Reds"),brewer.pal(3,"Greens"),brewer.pal(3,"Blues"))

ggplot(gg, aes(x=browserNames, y=percentages, 
               fill=factor(colors), group=factor(clusterNames))) +
  geom_bar(stat="identity",position="dodge", color="grey70") + 
  scale_fill_manual("Cluster", values=col, 
                    breaks=c(3,6,9), labels=c("Cluster1","Cluster2","Cluster3"))

jlhoward
  • I think (per my comment above) they wanted a different grouping, but this should be easily modified to handle that. The other option I was thinking of was to simply map `percentages` to the alpha aesthetic. – joran Jan 14 '14 at 21:24
  • That's close, jlhoward, but what I'm looking for is what Joran has described. Joran, could you expand on mapping percentages to the alpha aesthetic? – jLangford Jan 14 '14 at 21:27
  • @user3195549 jlhoward's method will work too, it will just require shuffling the groups around. If you are patient, I'm sure they'll edit to show you. My method would simply have entailed including `alpha = percentages` inside `aes()`, but I think jlhoward's method will look nicer and be more flexible. – joran Jan 14 '14 at 21:30
  • @jlhoward That is SO close! I want the high and low of the gradient to be within the same box. Each bar for each cluster will look the same, just different heights. Here's a link to what I want the picture to look like: https://www.dropbox.com/s/py6hifejqz7k54v/gradientExample.bmp – jLangford Jan 14 '14 at 21:44
  • I was afraid of that, but then @joran's comment seemed to take this in a different (better) direction. If there's a way to do this, I don't know it. I urge you to think carefully about whether this would make your data easier to understand. This is the core principle of ggplot. – jlhoward Jan 14 '14 at 21:59
  • I will second @jlhoward's comment. The image you link to doesn't match what we had in mind at all. The gradient you describe adds nothing to the graphic, and would be a prime example of "chart junk". That kind of thing is made difficult, nigh on impossible in ggplot on purpose, because it is widely considered to be a bad idea. – joran Jan 14 '14 at 22:12
  • I'm not disagreeing with your "chart junk" description. I'll just provide a little background ... I work for a large digital marketing company, and we had a professional design team create a template for the presentations that we create every day for clients. I'm simply trying to match their formatting and create an automated script that will create the graphs for me. – jLangford Jan 14 '14 at 22:17
  • @jLangford Well, you aren't likely to be able to do this (easily) in ggplot, that's all. Technically, anything is possible if you delve into the grid internals, but it will not be pleasant. – joran Jan 14 '14 at 22:47
  • Thanks for entertaining my question! I was just trying to avoid going into Excel and making the graphs, but it's looking like that will be the easier route. – jLangford Jan 14 '14 at 23:03
  • @jLangford Can't you automate Excel for this? If you need R for the statistics, you could automate R from inside Excel using VBA and [RExcel](http://www.statconn.com/products.html) ([Documentation here](http://www.unt.edu/rss/class/splus/UsingRWithinExcel.pdf)). – jlhoward Jan 14 '14 at 23:04
  • I hate adding superfluous comments, but this discussion is exactly what newcomers to R and ggplot2 should be pointed to. Coming from Excel, it is eye-opening how many of the "effects" added to charts are just plain unnecessary. Thanks joran and jlhoward for the comments, and thanks jLangford for bringing this up. – americo Jan 14 '14 at 23:20
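
For readers who want to try the alternative joran mentions in the comments (mapping `percentages` to the alpha aesthetic), here is a minimal sketch. It is not code from the thread: the seed, the three base colors, and the alpha range of 0.3 to 1 are assumptions chosen for illustration. Each cluster keeps one hue, and taller bars are drawn more opaque. This shades whole bars by height; it does not produce the within-bar gradient shown in the linked image.

```r
library(ggplot2)
library(scales)   # provides percent() for the axis labels

set.seed(1)       # assumption: fixed seed so the example is reproducible
browserNames <- c("microsoft", "mozilla", "google")
clusterNames <- c("Cluster 1", "Cluster 2", "Cluster 3")
myData <- data.frame(
  browserNames = rep(browserNames, 3),
  clusterNames = rep(clusterNames, each = 3),
  percentages  = runif(9)
)

# One base color per cluster (illustrative choices, not from the thread)
clusterColors <- c("Cluster 1" = "red3",
                   "Cluster 2" = "green4",
                   "Cluster 3" = "blue3")

p <- ggplot(myData, aes(x = browserNames, y = percentages,
                        fill = clusterNames, alpha = percentages)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_fill_manual("Cluster", values = clusterColors) +
  # Keep the shortest bars visible by not letting alpha fall below 0.3
  scale_alpha_continuous(range = c(0.3, 1), guide = "none") +
  scale_y_continuous(name = "Percent Weight", labels = percent)

print(p)
```

Because alpha is mapped to the same variable as bar height, it encodes no extra information; as discussed above, whether the added shading helps the reader is a separate question.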