library(data.table)
library(ggplot2)
d <- data.table(setvalue = c("1.c , 1.d , 1.f , 2.b ", "1.b , 1.d , 1.f , 2.f ", "1.c , 1.d , 2.f , 2.h ", "1.b , 1.d , 1.f , 2.i ","1.c , 1.d , 2.f , 3.j "),
pct = c(0.06, 0.04, 0.028, 0.026, 0.017),
cumpct = c(0.06, 0.10, 0.128, 0.156, 0.173))
# Put each element of a combination on its own line in the axis label
break_at_comma <- function(x) gsub(",", "\n", x)
ggplot(d, aes(x = reorder(setvalue, cumpct, sum), y = pct)) +
  geom_bar(stat = "identity") +
  theme_bw() +
  scale_y_continuous(labels = scales::percent, name = "Percent of all combinations") +
  scale_x_discrete(name = "chosen combinations", labels = break_at_comma)
However, the 'combinations' listed on the x axis have meaning, so I'd like to color each piece of the label text by its prefix: text that starts with "1.[a-z]{1}" green, text that starts with "2.[a-z]{1}" yellow, and text that starts with "3.[a-z]{1}" red.
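To make the rule concrete, this is roughly the mapping I have in mind, written as a small base R sketch. The helper name prefix_colour is hypothetical (it is not from any package), and it only assigns a color name to each element of a label. It does not solve the actual problem of getting ggplot2 to draw differently colored pieces within one axis label.

```r
# Hypothetical helper: map elements such as "1.c" to a colour by their leading digit.
# Elements that match none of the three patterns get NA.
prefix_colour <- function(x) {
  x <- trimws(x)                      # labels carry stray spaces around the commas
  out <- rep(NA_character_, length(x))
  out[grepl("^1\\.[a-z]$", x)] <- "green"
  out[grepl("^2\\.[a-z]$", x)] <- "yellow"
  out[grepl("^3\\.[a-z]$", x)] <- "red"
  out
}

# Split one label into its elements and look up the colour of each
elements <- strsplit("1.c , 1.d , 2.f , 3.j ", ",", fixed = TRUE)[[1]]
prefix_colour(elements)
# [1] "green"  "green"  "yellow" "red"
```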
I hope this makes sense. The end result should look something like this (labels are repeated, so only look at the colors):