
My dataset in R looks like the following:

a <- c("M","F","F","F","M","M","F","F","F","M","F","F","M","M","F")
p <- c("P","P","W","W","P","P","W","W","W","W","P","P","P","W","W")
y1 <- c("yes","yes","null","no","no","no","yes","null","no","yes","yes","yes","null","no","no")
y2 <- c("yes","null","no","no","no","yes","yes","yes","null","no","yes","null","no","yes","yes")
y3 <- c("no","no","no","yes","null","yes","null","no","no","no","yes","yes","null","no","no")
VE <- data.frame(gender = a,
             type = p,
             y1 = y1,
             y2 = y2,
             y3 = y3)

And I would like to create a bar chart which looks like this: (image: ideal bar chart)

I just figured out a long way to get the chart:

library(ggplot2)

q<-data.frame(gender=VE$gender,
          year=rep("y1",15),
          group=VE$y1)
p<-data.frame(gender=VE$gender,
          year=rep("y2",15),
          group=VE$y2)
x<-data.frame(gender=VE$gender,
          year=rep("y3",15),
          group=VE$y3)
Table<-rbind(q,p,x)
ggplot(Table, aes(year)) + geom_bar(aes(fill=group), position = "stack") + facet_grid(gender~.)

Is there a better way to get the bar chart? (My real data has 3,000,000 observations with 32 variables each.) Any help with this bar chart would be appreciated. Cheers!

J.wz
  • Showing your data in an image does not make it easy to duplicate. Read [this](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and the help, then edit your question. Or search for your [question](https://stackoverflow.com/questions/38877580/stacked-bar-plot-with-4-categorical-variables-in-r?rq=1); it is in the related questions at the side of your question, after all. – shea Oct 09 '17 at 21:11
  • Possible duplicate of [Stacked bar plot with 4 categorical variables in R](https://stackoverflow.com/questions/38877580/stacked-bar-plot-with-4-categorical-variables-in-r) – shea Oct 09 '17 at 21:11

1 Answer


First, melt your data.frame into 'long' format. To do this I added an ID variable and stacked the three variables 'y1', 'y2', and 'y3' into a single column. You can then use ggplot2's geom_bar(), which counts the values of the x aesthetic when no y aesthetic is provided.

library(ggplot2)

# create data frame
df <- data.frame(ID = 1:15, 
             gender = c('M', 'F', 'F', 'F', 'M', 'M', 'F', 'F', 'F', 'M', 'F', 'F', 'M', 'M', 'F'),
             type = toupper(c('p', 'p', 'w', 'w', 'p', 'p', 'w', 'w', 'w', 'w', 'p', 'p', 'p', 'W', 'W')),
             y1 = c('yes', 'yes', 'null', 'no', 'no', 'no', 'yes', 'null', 'no', 'yes', 'yes', 'yes', 'null', 'no', 'no'),
             y2 = c('yes', 'null', 'no', 'no', 'no', 'yes', 'yes', 'yes', 'null', 'no', 'yes', 'null', 'no', 'yes', 'yes'),
             y3 = c('no', 'no', 'no', 'yes', 'null', 'yes', 'null', 'no', 'no', 'no', 'yes', 'yes', 'null', 'no', 'no'),
             stringsAsFactors = TRUE)

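# melt data frame to long format
# (data.table's melt() is written for data.tables, so convert the data.frame first)
df_melt <- data.table::melt(data.table::as.data.table(df[, c(1, 4:6)]), id.vars = "ID")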

# set correct levels for factor (needed for the legend)
df_melt$value <- factor(df_melt$value, levels = c("yes", "no", "null"))

# add ggplot
ggplot(data = df_melt) + 
  geom_bar(aes(x = variable, fill = value, colour = value)) +
  ylab("count") +
  xlab("year")

Which returns:

(image: output_ggplot)
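
The plot above drops gender, while the chart in the question is faceted by it. As a rough sketch (not part of the original answer), the same melt step can keep gender and type as id columns, so the facet from the question still works. The names VE_long and counts are made up for illustration, and the pre-aggregation step at the end is only an assumption about what might help with 3,000,000 rows; it has not been timed on data of that size.

```r
library(data.table)
library(ggplot2)

# Toy data from the question (stands in for the real 3,000,000-row table)
VE <- data.frame(
  gender = c("M","F","F","F","M","M","F","F","F","M","F","F","M","M","F"),
  type   = c("P","P","W","W","P","P","W","W","W","W","P","P","P","W","W"),
  y1 = c("yes","yes","null","no","no","no","yes","null","no","yes","yes","yes","null","no","no"),
  y2 = c("yes","null","no","no","no","yes","yes","yes","null","no","yes","null","no","yes","yes"),
  y3 = c("no","no","no","yes","null","yes","null","no","no","no","yes","yes","null","no","no")
)

# Long format: one row per (person, year); gender and type are carried along
VE_long <- melt(as.data.table(VE),
                id.vars       = c("gender", "type"),
                measure.vars  = c("y1", "y2", "y3"),
                variable.name = "year",
                value.name    = "group")

# Fix the order of the legend entries
VE_long[, group := factor(group, levels = c("yes", "no", "null"))]

# Same chart as in the question: stacked bars per year, one panel per gender
ggplot(VE_long, aes(x = year, fill = group)) +
  geom_bar(position = "stack") +
  facet_grid(gender ~ .)

# Optional: count once up front and plot the small summary table instead
counts <- VE_long[, .N, by = .(gender, year, group)]
ggplot(counts, aes(x = year, y = N, fill = group)) +
  geom_col(position = "stack") +
  facet_grid(gender ~ .)
```

Both ggplot calls should draw the same bars; the second one only hands ggplot2 the summary table rather than every row.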

clemens
  • Basically, it's the melt() function in "reshape" package which automatically implements my long-way transform. Thanks! – J.wz Oct 09 '17 at 22:20