I believe my question is very similar to this post. The only difference is that my aes fill is a factor with multiple levels. This is what I am after (image not shown).

This is how far I have gotten:

set.seed(123)
n = 100

LoanStatus = sample(c('Chargedoff', 'Completed', 'Current', 'Defaulted', 'PastDue'), n, replace = TRUE)
ProsperScore = sample(1:11, n, replace = TRUE)

df = data.frame(ProsperScore, LoanStatus = factor(LoanStatus))

probs = data.frame(prop.table(table(df),1))
CRich
  • 1
    Images of data are nearly worthless - good ways to share data are (a) use built-in data, (b) share code to simulate sample data (use `set.seed()` so any randomness is reproducible), or (c) use `dput()` to share a copy/pasteable sample of your data. There are lots more good ideas for making R reproducible examples [in this question](https://stackoverflow.com/q/5963269/903061). – Gregor Thomas Aug 10 '17 at 17:52
  • 1
    Similarly, you show your code in an image. It is difficult to read and impossible to copy/paste to try out. Share code as text (formatted nicely) so that it is easy to read and easy to copy. – Gregor Thomas Aug 10 '17 at 17:54
  • @Gregor I redid the whole post, any other suggestions? Thank you. – CRich Aug 11 '17 at 08:40
  • Much better, I'll try to find some time today to answer (unless someone does it first) – Gregor Thomas Aug 11 '17 at 15:31

1 Answer

Code for the stacked bar plot could look something like this:

library(ggplot2)

brks <- c(0, 0.25, 0.5, 0.75, 1)

ggplot(data=probs,aes(x=ProsperScore,y=Freq,fill=LoanStatus)) +
  geom_bar(stat="identity") +
  scale_y_continuous(breaks = brks, labels = scales::percent(brks)) +
  scale_x_discrete(breaks = c(3,6,9))

More complete code, demonstrating how you would go about adding percentages to the plot, is here:

library(ggplot2)
library(dplyr)  # provides %>% along with group_by() and mutate(); plyr does not export %>%

brks <- c(0, 0.25, 0.5, 0.75, 1)

probs <- probs %>% dplyr::group_by(ProsperScore) %>%
  dplyr::mutate(pos=cumsum(Freq)-(Freq*0.5)) %>%
  dplyr::mutate(pos=ifelse(Freq==0,NA,pos))

probs$LoanStatus <- factor(probs$LoanStatus, levels = rev(levels(probs$LoanStatus))) 

ggplot(data=probs,aes(x=ProsperScore,y=Freq,fill=LoanStatus)) +
  geom_bar(stat="identity") +
  scale_y_continuous(breaks = brks, labels = scales::percent(brks)) +
  scale_x_discrete(breaks = c(3,6,9)) +
  geom_text(data=probs, aes(x = ProsperScore, y = pos,
                                  label = paste0(round(100*Freq),"%")), size=2)
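
As an alternative to computing `pos` by hand, ggplot2 can place the labels itself through `position_stack(vjust = 0.5)`, which centres each label inside its bar segment. The following is only a sketch under that assumption, not the code the plot above was made with. It rebuilds the question's sample data so that it runs on its own, and it drops zero-frequency rows from the label layer so that no "0%" labels are drawn.

```r
library(ggplot2)

# Rebuild the sample data from the question so the sketch is self-contained.
set.seed(123)
n <- 100
LoanStatus <- sample(c("Chargedoff", "Completed", "Current", "Defaulted", "PastDue"),
                     n, replace = TRUE)
ProsperScore <- sample(1:11, n, replace = TRUE)
df <- data.frame(ProsperScore, LoanStatus)

# Row-wise proportions: within each ProsperScore the frequencies sum to 1.
probs <- data.frame(prop.table(table(df), 1))

p <- ggplot(probs, aes(x = ProsperScore, y = Freq, fill = LoanStatus)) +
  geom_col() +
  # The text layer inherits x, y and fill, so it is stacked in the same
  # order as the bars; vjust = 0.5 puts each label mid-segment.
  geom_text(data = subset(probs, Freq > 0),
            aes(label = paste0(round(100 * Freq), "%")),
            position = position_stack(vjust = 0.5), size = 2) +
  scale_y_continuous(labels = scales::percent)

print(p)
```

With this approach neither the `pos` column nor the reversal of the factor levels is needed, because the bars and the labels go through the same stacking logic.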

To only show the percentages in the first column of the graph, add %>% dplyr::mutate(pos=ifelse(ProsperScore==1,pos,NA)) to the dplyr calls.
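
Put together, the pipeline with that extra step might look like the sketch below. The sample data is rebuilt from the question so the block runs on its own. The name `probs_first` is just an illustrative choice, and `NA_real_` is used in place of a bare `NA` only to keep the column type explicit.

```r
library(dplyr)

# Rebuild the sample data from the question so the sketch is self-contained.
set.seed(123)
n <- 100
LoanStatus <- sample(c("Chargedoff", "Completed", "Current", "Defaulted", "PastDue"),
                     n, replace = TRUE)
ProsperScore <- sample(1:11, n, replace = TRUE)
df <- data.frame(ProsperScore, LoanStatus)
probs <- data.frame(prop.table(table(df), 1))

probs_first <- probs %>%
  group_by(ProsperScore) %>%
  # Midpoint of each segment within its bar.
  mutate(pos = cumsum(Freq) - Freq * 0.5) %>%
  # No label for empty segments.
  mutate(pos = ifelse(Freq == 0, NA_real_, pos)) %>%
  # Keep label positions only for the first column (ProsperScore 1).
  mutate(pos = ifelse(ProsperScore == 1, pos, NA_real_)) %>%
  ungroup()
```

Rows whose `pos` is `NA` are dropped by `geom_text()` with a warning about removed rows, which is expected here.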

gregor-fausto
  • That did not work, I just got errors no matter what library I loaded. I am unsure how to mutate the chunk of code with pipeline operators. How would I do so? Here is a [screenshot](https://photos.google.com/share/AF1QipN_-ItSXqzUhZmVl522QEr3mpqeePqWuruxnPaBVkG7KSZ8ZQgmnBnByeJ-Jlx-6g?key=enFIZl9ZdzRaVk1vTXdGZ1c3Q1o4dVZGMDlIbFF3) of my sessionInfo and errors. – CRich Aug 15 '17 at 09:08
  • 1
    Sorry, there was an extra `%>%`. Does that work for you now? – gregor-fausto Aug 15 '17 at 13:42