I have written R code that produces a ggplot2 bar chart. I need some help customising the colours for the "Yes", "No" and "Maybe" responses.

More precisely, I would like to change colours that appear for each question based on the "Yes", "No" and "Maybe" responses.

I have tried the basics of ggplot. However, I am not able to make the colours any more customisable than they already are.

library(dplyr)
library(ggplot2)
theme_set(theme_classic())
library(tidyverse)

data <- read.csv('data.csv', header = TRUE, stringsAsFactors = FALSE)
str(data)

data$stemmed <- factor(data$stemmed, levels=c("No", "Yes", "Maybe"))

data$QuestionNumber <- ordered(data$QuestionNumber, levels = c("Q1", "Q2", "Q3", "Q4", "Q5","Q6","Q7","Q8","Q9","Q10","Q11","Q12", "Q13", "Q14", "Q15", "Q16", "Q17"))

data$QuestionNumber <- forcats::fct_rev(factor(data$QuestionNumber))

g <- ggplot(data, aes(QuestionNumber))
g + geom_bar(aes(fill=stemmed), width = 0.5) + 
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Histogram Plot") +  coord_flip()

The plot should use the colours that I specify. For example, "Yes" in Q1 could be green, while "Yes" in Q3 could be grey, as I choose. Is this possible?

Data in dput format.

data <-
structure(list(QuestionNumber = structure(c(17L, 17L, 17L, 17L, 
17L, 17L, 17L, 17L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 15L, 15L, 
15L, 15L, 15L), .Label = c("Q17", "Q16", "Q15", "Q14", "Q13", 
"Q12", "Q11", "Q10", "Q9", "Q8", "Q7", "Q6", "Q5", "Q4", "Q3", 
"Q2", "Q1"), class = c("ordered", "factor")), stemmed = structure(c(2L, 
1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 1L, 1L, 1L, 1L, 3L, 1L, 2L, 2L, 
3L, 2L, 3L), .Label = c("No", "Yes", "Maybe"), class = "factor")), row.names = c(NA, 
20L), class = "data.frame")
Rui Barradas
Dinesh
  • Can you post sample data? Please edit **the question** with the output of `dput(data)`. Or, if it is too big, with the output of `dput(head(data, 20))`. – Rui Barradas Jan 15 '19 at 17:45
  • dput output has been added. Sorry for not doing this earlier. – Dinesh Jan 15 '19 at 17:53
  • For future questions you should take a look at https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example . A few side points: remove unnecessary things like the theme, read_csv is faster/better than read.csv, and replace the read_csv part with the dput output so your code runs in one command. – Sahir Moosvi Jan 15 '19 at 18:07

1 Answer

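You can set colors using scale_fill_manual or scale_color_manual, depending on which aesthetic you have mapped (here the bars use fill, so scale_fill_manual). There you can specify the hex code for each color you want. If you name the elements of the vector as I have, the colors are matched to the levels by name, so you don't have to guess the legend order either.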

g + geom_bar(aes(fill=stemmed), width = 0.5) + 
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Histogram Plot") +  coord_flip() +
  scale_fill_manual( values = c("No" = "#ff0000","Yes" = "#00ff00", "Maybe" = "#0000ff") )
Sahir Moosvi
  • Thanks Sahir. If I'm understanding your code correctly, this will apply globally to every Yes/No/Maybe value. What if I want more flexibility, with different colours for each question in the output? – Dinesh Jan 15 '19 at 18:06
  • Then you should create a new variable that determines the value based on whatever your conditions are for the colour, and then map those values to colours as demonstrated by Sahir – see24 Jan 15 '19 at 18:08
  • Yes, as @see24 says, you would have to create multiple variables in order to specify exactly when to use each color. But I'd also say that I would find that really confusing. Yes should be yes – Sahir Moosvi Jan 15 '19 at 18:14
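
The "new variable" idea from the comments could look roughly like the sketch below. It is only one possible reading of that suggestion, not code from the answer: the small data frame is a made-up stand-in with the same two columns as the question's data, and the names fill_key, base_cols and fill_cols are hypothetical. Each question/response pair gets its own key (for example "Q3.Yes"), every key starts with the default colour of its response, and individual pairs are then overridden.

```r
library(ggplot2)

# Made-up stand-in data with the same columns as the question's data frame.
data <- data.frame(
  QuestionNumber = factor(
    c("Q1", "Q1", "Q1", "Q2", "Q2", "Q3", "Q3", "Q3"),
    levels = c("Q3", "Q2", "Q1")
  ),
  stemmed = factor(
    c("Yes", "No", "Yes", "Maybe", "No", "Yes", "Yes", "Maybe"),
    levels = c("No", "Yes", "Maybe")
  )
)

# Default colour for each response (same hex codes as in the answer).
base_cols <- c("No" = "#ff0000", "Yes" = "#00ff00", "Maybe" = "#0000ff")

# New variable (hypothetical name): one key per question/response pair.
data$fill_key <- paste(data$QuestionNumber, data$stemmed, sep = ".")

# Give every key the default colour of its response part ...
keys <- unique(data$fill_key)
fill_cols <- setNames(base_cols[sub("^.*\\.", "", keys)], keys)

# ... then override the pairs that should differ.
fill_cols["Q3.Yes"] <- "grey50"  # "Yes" in Q3 is drawn grey instead of green

p <- ggplot(data, aes(QuestionNumber)) +
  geom_bar(aes(fill = fill_key), width = 0.5) +
  scale_fill_manual(values = fill_cols) +
  labs(title = "Histogram Plot") +
  coord_flip()
print(p)
```

Because fill is now mapped to the combined key, the legend lists every question/response pair rather than just Yes/No/Maybe, which is part of why the last comment calls this approach confusing.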