I've created the plot I need, but I'd like to slide the bars so that the center neutral category evenly straddles zero (x=0) for each subplot. Any ideas? Perhaps I'm not using the right geometric construct here?

library(ggplot2)
survey_data <- data.frame(gender=rep(c("Unreported","Female","Male"),7),
                      feel_job=c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7),
                      Freq=c(0, 0, 0, 1, 3, 5, 0, 4, 4, 0, 7, 15, 3, 28, 35, 3, 35, 80, 1, 52, 108))
p <- ggplot(survey_data, aes(gender)) + 
  geom_bar(aes(y = Freq, fill = factor(feel_job)), stat = "identity") +
  coord_flip()
p

Apparently this type of plot is also called a "Diverging Stacked Bar Chart" in some circles.

Julius Vainora
Sterling
  • Have a look at [this](http://stackoverflow.com/questions/13734368/ggplot2-and-a-stacked-bar-chart-with-negative-values) answer, which does something very similar, but in vertical rather than horizontal mode. – SlowLearner Jul 11 '13 at 22:37
  • I would think the problem with that plot is it screws up the ordering of the factors in the bar. I also don't like stacking as that doesn't let you see the actual size of the different bars. – Andy Clifton Jul 11 '13 at 23:14
  • OP, are you still out there? Want to give us some feedback? – Andy Clifton Jul 20 '13 at 14:59
  • Sorry for the delay. The plots used in the literature are usually stacked, but I see the value in leaving them unstacked. The HH package mentioned below will yield exactly what I wanted (if you can get your data in the format their package will easily accept, which is another task), but I'd rather use ggplot2 at this point. – Sterling Jul 23 '13 at 16:26

2 Answers

I'm guessing that, because the factor feel_job has 7 levels, you want level 4 to sit centered over the axis at zero. Based on the examples at http://learnr.wordpress.com/2009/09/24/ggplot2-back-to-back-bar-charts/, I figured there might be a way to cheat.

The key seems to be not to rely on ggplot to do everything. Instead, you have to create your statistics and then fiddle with them. I decided to try using geom_rect() rather than the vanilla geom_bar(), which means I need to give each bar values for xmin, xmax, ymin and ymax. The rest of this answer is going to show how to do that.

# save the data we were given
a.survey.data <- survey_data

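# going to plot this as rectangles: levels below 4 extend left from zero,
# level 4 is centered on zero, and levels above 4 extend right from zero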
a.survey.data$xmin[a.survey.data$feel_job < 4] = -a.survey.data$Freq[a.survey.data$feel_job < 4]
a.survey.data$xmin[a.survey.data$feel_job == 4] = -a.survey.data$Freq[a.survey.data$feel_job == 4]/2
a.survey.data$xmin[a.survey.data$feel_job > 4] = 0

a.survey.data$xmax[a.survey.data$feel_job < 4] = 0
a.survey.data$xmax[a.survey.data$feel_job == 4] = a.survey.data$Freq[a.survey.data$feel_job == 4]/2
a.survey.data$xmax[a.survey.data$feel_job > 4] = a.survey.data$Freq[a.survey.data$feel_job > 4]

# assign values to ymin and ymax based on gender and feel_job
y.base <- NA
y.base[a.survey.data$gender == "Female"] = 1
y.base[a.survey.data$gender == "Male"] = 2
y.base[a.survey.data$gender == "Unreported"] = 3

a.survey.data$ymin <- y.base + (a.survey.data$feel_job-4)*0.1 - 0.05
a.survey.data$ymax <- y.base + (a.survey.data$feel_job-4)*0.1 + 0.05

# set the labels
a.survey.data$feel_job.cut <- factor(cut(a.survey.data$feel_job,
                                         breaks = c(0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5),
                                         labels = c("1",
                                                    "2",
                                                    "3",
                                                    "neutral",
                                                    "5",
                                                    "6",
                                                    "7"),
                                         ordered_result = TRUE))

p2 <- ggplot(data = a.survey.data,
             aes(xmax = xmax,
                 xmin = xmin,
                 ymax = ymax,
                 ymin = ymin)) +
  geom_rect(aes(fill = feel_job.cut)) +  
  scale_y_continuous(limits = c(0.5,3.5),
                     breaks=c(1,2,3), 
                     labels=c("Female","Male","Unreported"))
print(p2)

and then we end up with this plot: [image]
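If stacked bars are preferred, as in the original question, the same idea of computing the rectangle bounds by hand can probably be adapted. The following is only a minimal sketch and not part of the answer above: the helper `stack_bounds`, the `neutral` variable and the bar half-height of 0.4 are assumptions made for illustration. For each gender it stacks the levels left to right and then shifts the whole bar so that the midpoint of the neutral level lands on zero.

```r
library(ggplot2)

survey_data <- data.frame(
  gender   = rep(c("Unreported", "Female", "Male"), 7),
  feel_job = rep(1:7, each = 3),
  Freq     = c(0, 0, 0, 1, 3, 5, 0, 4, 4, 0, 7, 15, 3, 28, 35, 3, 35, 80, 1, 52, 108)
)

neutral <- 4  # assumption: level 4 is the neutral category

# Hypothetical helper: stacked left/right bounds for one gender,
# shifted so the neutral segment is centered on zero.
stack_bounds <- function(d) {
  d <- d[order(d$feel_job), ]
  right <- cumsum(d$Freq)   # right edge of each stacked segment
  left  <- right - d$Freq   # left edge of each stacked segment
  is_neutral <- d$feel_job == neutral
  shift <- left[is_neutral] + d$Freq[is_neutral] / 2
  d$xmin <- left - shift
  d$xmax <- right - shift
  d
}

stacked <- do.call(rbind, lapply(split(survey_data, survey_data$gender), stack_bounds))

# Map each gender to a row position (levels are sorted alphabetically)
gender_levels <- levels(factor(stacked$gender))
stacked$y <- as.numeric(factor(stacked$gender, levels = gender_levels))

p3 <- ggplot(stacked,
             aes(xmin = xmin, xmax = xmax, ymin = y - 0.4, ymax = y + 0.4)) +
  geom_rect(aes(fill = factor(feel_job))) +
  scale_y_continuous(breaks = seq_along(gender_levels), labels = gender_levels) +
  labs(x = "Freq", y = "gender", fill = "feel_job")
print(p3)
```

Because the bounds are computed before plotting, the segment order inside each bar stays 1 through 7 from left to right, which addresses the ordering concern raised in the comments on the question.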

Andy Clifton
  • I'd like to add that this is a bad hack even by my standards. – Andy Clifton Jul 11 '13 at 23:03
  • Not exactly what I was looking for, but close enough. This is strong work. From my perspective, though, the best tip you gave was to not rely so much on ggplot to do everything. – Sterling Jul 23 '13 at 16:28

You can do it using the likert function from the HH library:

library(HH)
survey_data <- data.frame(gender=rep(c("Unreported","Female","Male"),7),
                  feel_job=c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7),
                  Freq=c(0, 0, 0, 1, 3, 5, 0, 4, 4, 0, 7, 15, 3, 28, 35, 3, 35, 80, 1, 52, 108))
likert(Freq ~ gender + feel_job, survey_data)

I'm not sure if this is exactly what you're going for, but it looks to me like this (or something very similar) should work.
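Since getting the data into a shape the package accepts comes up in the comments, here is a small base-R sketch of turning the long table into one row per gender and one column per feel_job level. The names `wide` and `wide_df` are made up for illustration, and the final `likert` call is left commented out because whether HH's formula interface accepts exactly this layout is an assumption rather than something verified here.

```r
survey_data <- data.frame(
  gender   = rep(c("Unreported", "Female", "Male"), 7),
  feel_job = rep(1:7, each = 3),
  Freq     = c(0, 0, 0, 1, 3, 5, 0, 4, 4, 0, 7, 15, 3, 28, 35, 3, 35, 80, 1, 52, 108)
)

# Cross-tabulate: one row per gender, one column per feel_job level
wide <- xtabs(Freq ~ gender + feel_job, data = survey_data)

# Convert the contingency table to a plain data frame and keep
# the gender labels as an ordinary column
wide_df <- as.data.frame.matrix(wide)
wide_df$gender <- rownames(wide_df)
print(wide_df)

# Assumption (untested here): a wide layout like this may be usable as
# HH::likert(gender ~ ., data = wide_df)
```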

ben
  • It's close. The code you posted doesn't quite work for what I need, but it looks like that package might be helpful. I'd like to stick with ggplot2, if possible, though. – Sterling Jul 11 '13 at 21:37