I have a Gender variable and 10 different treatments. I used geom_histogram in ggplot2. For each treatment I want to see the percentage of males and females, with every bar standardized so that the y-axis maximum is 1 and I can compare the treatments by percentage.

In particular, I am interested in generating a graph like this one

https://cengel.github.io/R-data-viz/R-data-viz_files/figure-html/color-bar-gender-1.png

Some of the data looks like this:

Data1 <- structure(list(Treatment = structure(c(3L, 3L, 3L, 3L, 3L, 4L
), .Label = c("", "{\"ImportId\":\"Treatment\"}", "Altruism", 
"Altruism - White", "Piece Rate - 0 cents", "Piece Rate - 3 cents", 
"Piece Rate - 6 cents", "Piece Rate - 9 cents", "Reciprocity", 
"Reciprocity - Black", "Reciprocity - White", "Treatment"), class = "factor"), 
    Gender = structure(c(5L, 3L, 5L, 5L, 5L, 3L), .Label = c("", 
    "{\"ImportId\":\"QID2\"}", "Female", "Gender you most closely identify with: - Selected Choice", 
    "Male", "Other", "Prefer not to answer"), class = "factor")), 
    class = "data.frame", row.names = c(NA, -6L))

ggplot(Data1, aes(x=Treatment, fill=Gender))+
  geom_histogram(bins = 15, col="black",stat="count")+
  ggtitle("Gender")+
  xlab("Treatment")+ylab("Density")+
  theme_classic()+
  theme(axis.line = element_blank(),
        axis.ticks = element_blank())

I get something like this

My output

This is my code thus far, which works. The only thing I don't know how to do is make all my bars the same height (1, as a standardized value), so that each treatment's bar is divided by percentage.

Alex Ruiz
  • Please [make this question reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including some or all of the data `Data1` as plain text. Also I think you want a stacked bar chart, not a histogram. You'll find many examples if you search this site. – neilfws Jun 05 '19 at 00:51
  • I hope my edits now can help – Alex Ruiz Jun 05 '19 at 00:58
  • @AlejandroRuiz please read the link to make your edit better – M-- Jun 05 '19 at 01:01
  • https://stackoverflow.com/a/21236366/9699371 – bbiasi Jun 05 '19 at 01:07
  • Not all treatments have the same number of observations, so I am looking to set the height of every bar to, say, 1, so that the bars differ only in the distribution of the percentages that make them up – Alex Ruiz Jun 05 '19 at 01:10
  • There is sure to be a very similar existing question with answers, just a question of identifying the best one :) – neilfws Jun 05 '19 at 01:14

1 Answer

Let's generate some example data:

library(dplyr)
library(ggplot2)

set.seed(1001)
Data1 <- data.frame(Treatment = sample(LETTERS[1:5], 100, replace = TRUE),
                    Gender    = sample(c("Male", "Female"), 100, replace = TRUE))

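Now we can use dplyr::count to get the number of observations for each Treatment and Gender combination. The key is position = "fill", which rescales every stacked bar to a total height of 1, so the segments show proportions rather than counts: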

Data1 %>% 
  count(Treatment, Gender) %>% 
  ggplot(aes(Treatment, n)) + 
  geom_col(aes(fill = Gender), position = "fill")
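
To see what position = "fill" is doing, the within-treatment shares can also be computed by hand and plotted directly. This is only a sketch: the names `shares` and `prop` are made up for illustration, and the percent axis labels assume the scales package is installed.

```r
library(dplyr)
library(ggplot2)

# Same example data as above
set.seed(1001)
Data1 <- data.frame(Treatment = sample(LETTERS[1:5], 100, replace = TRUE),
                    Gender    = sample(c("Male", "Female"), 100, replace = TRUE))

# Share of each Gender within each Treatment
# (`shares` and `prop` are illustrative names)
shares <- Data1 %>%
  count(Treatment, Gender) %>%
  group_by(Treatment) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# geom_col() stacks by default; the shares sum to 1 within each
# treatment, so every bar ends up with height 1
p <- ggplot(shares, aes(Treatment, prop, fill = Gender)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  ylab("Share of respondents")
p
```

Having the `prop` column around is handy if you also want to print the percentages in a table or add them as labels.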

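
The same idea should also work on the raw rows without counting first, since geom_bar() counts for you. The sketch below uses a small made-up stand-in for the question's data; it assumes that factor levels such as "{\"ImportId\":\"QID2\"}" come from header rows of a survey export and should be dropped before plotting. The names `raw`, `clean` and `p_raw` are illustrative.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the question's data (made up for illustration):
# the first two rows mimic export header rows, the rest are responses
raw <- data.frame(
  Treatment = c("Treatment", "{\"ImportId\":\"Treatment\"}",
                "Altruism", "Altruism", "Altruism",
                "Altruism - White", "Altruism - White"),
  Gender    = c("Gender you most closely identify with: - Selected Choice",
                "{\"ImportId\":\"QID2\"}",
                "Male", "Female", "Male", "Female", "Male"),
  stringsAsFactors = TRUE
)

# Keep only rows whose Gender is a real answer option,
# then drop the factor levels that are no longer used
clean <- raw %>%
  filter(Gender %in% c("Male", "Female", "Other", "Prefer not to answer")) %>%
  droplevels()

# geom_bar() counts the rows itself; position = "fill" rescales each bar to 1
p_raw <- ggplot(clean, aes(x = Treatment, fill = Gender)) +
  geom_bar(position = "fill", colour = "black") +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Gender", x = "Treatment", y = "Share of respondents") +
  theme_classic()
p_raw
```

Treatments with different numbers of observations still get bars of equal height, because each bar is scaled by its own total.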

neilfws