I'm trying to plot an overlaid graph of two professions and the probability of their incomes:
> df <- data.frame(income = rep(c("$10 to $20", "$20 to $30", "$30 to $40"), 2), profession = c("A", "A", "A", "B", "B", "B"), prob = c(10, 50, 40, 20, 50, 30))
> df
      income profession prob
1 $10 to $20          A   10
2 $20 to $30          A   50
3 $30 to $40          A   40
4 $10 to $20          B   20
5 $20 to $30          B   50
6 $30 to $40          B   30
Unfortunately, this doesn't work well when the income starts having values like "100", because the labels get sorted alphabetically, so we get an x-axis like (10, 100, 20, 30).
When I have a single profession, I can use df$income <- factor(df$income, levels = df$income), but that doesn't work here:
> df$income <- factor(df$income, levels = df$income)
Error in `levels<-`(`*tmp*`, value = as.character(levels)) :
factor level [4] is duplicated
Is there any way around that?
This is how I'm trying to plot:
ggplot(df, aes(x=income, y=prob, fill=profession)) + geom_bar(stat='identity', position='identity', alpha=0.5)
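One possible way around the duplicated-level error, as a minimal sketch: pass only the distinct income labels as levels, in their order of first appearance. This assumes the rows of df are already in the desired income order within each profession; if they are not, the levels vector would have to be written out by hand instead.

```r
# Same data as above, stored in df
df <- data.frame(
  income     = rep(c("$10 to $20", "$20 to $30", "$30 to $40"), 2),
  profession = c("A", "A", "A", "B", "B", "B"),
  prob       = c(10, 50, 40, 20, 50, 30)
)

# unique() drops the repeated labels but keeps their order of first
# appearance, so each level is listed once and the
# "factor level [4] is duplicated" error no longer occurs.
# Assumption: the first-appearance order is the order wanted on the x-axis.
df$income <- factor(df$income, levels = unique(df$income))

levels(df$income)
```

With income stored as a factor like this, the ggplot call above should not need to change, since ggplot2 orders a discrete axis by factor levels rather than alphabetically.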