I am learning R, so please excuse any mistakes. I have the following data frame:

QCVNTO=structure(list(O = c(1.34242268082221, 0.903089986991944, 2.55870857053317, 
2.40823996531185, 1.65321251377534, 0.903089986991944, 1.20411998265592, 
1.20411998265592, 1.20411998265592, 0.903089986991944, 0.903089986991944, 
1.65321251377534, 1.34242268082221, 1.04139268515823, 0.903089986991944, 
1.34242268082221, 1.34242268082221, 3.01029995663981, 1.34242268082221, 
1.34242268082221, 1.80617997398389, 0.903089986991944, 0.903089986991944, 
1.34242268082221, 1.34242268082221, 1.65321251377534, 1.20411998265592, 
1.04139268515823, 1.04139268515823, 1.65321251377534, 1.34242268082221, 
1.65321251377534, 0.903089986991944, 0.903089986991944, 0.903089986991944, 
0.903089986991944, 1.34242268082221, 1.34242268082221, 1.04139268515823, 
0.903089986991944, 0.903089986991944, 0.903089986991944, 1.95424250943932, 
0.903089986991944, 0.903089986991944, 1.80617997398389, 1.34242268082221, 
1.50514997831991, 1.34242268082221, 2.25767857486918, 1.80617997398389, 
1.95424250943932, 2.10720996964787, 1.50514997831991, 1.50514997831991, 
1.50514997831991, 1.50514997831991, 1.50514997831991, 1.95424250943932, 
1.95424250943932, 1.34242268082221, 1.50514997831991, 1.50514997831991, 
2.40823996531185, 1.65321251377534, 1.65321251377534, 1.50514997831991, 
1.50514997831991, 1.50514997831991, 1.80617997398389, 1.50514997831991, 
1.50514997831991, 1.80617997398389, 1.50514997831991, 1.50514997831991, 
1.34242268082221, 1.34242268082221, 1.50514997831991, 2.55870857053317, 
1.65321251377534, 1.80617997398389, 2.10720996964787, 1.80617997398389, 
1.80617997398389, 1.65321251377534, 3.01029995663981, 1.65321251377534, 
2.40823996531185, 1.80617997398389, 1.80617997398389, 1.65321251377534, 
2.40823996531185, 1.80617997398389, 1.04139268515823, 1.65321251377534, 
1.80617997398389, 2.40823996531185, 1.65321251377534, 3.01029995663981, 
1.95424250943932, 1.80617997398389, 1.80617997398389, 1.50514997831991, 
2.10720996964787, 1.65321251377534, 1.80617997398389, 1.50514997831991, 
1.80617997398389, 2.70926996097583, 1.65321251377534, 1.95424250943932, 
2.25767857486918, 2.10720996964787, 1.65321251377534, 1.80617997398389, 
1.80617997398389, 1.50514997831991, 1.80617997398389, 0.903089986991944, 
3.01029995663981, 2.55870857053317, 1.04139268515823, 1.80617997398389
), ProtectionStatus = c(1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 
0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 
0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
0, 1, 1, 0, 1)), .Names = c("O", "ProtectionStatus"), row.names = c(NA, 
-123L), class = "data.frame")

Then I calculated the frequency of ProtectionStatus for each 'O' class using this code:

df=as.data.frame(xtabs(~ ProtectionStatus + O, data = QCVNTO))

Then I plotted a stacked percentage bar plot showing the share of each ProtectionStatus within each 'O' class using ggplot:

ggplot(df,aes(x = O, y = Freq, fill = ProtectionStatus)) +  
geom_bar(position = "fill",stat = "identity") +  
scale_y_continuous(labels = percent, breaks = seq(0,1,by=0.1))+  
labs(title = "Log 10SN50 Vs Percentage of Protection", y = "Percentage of Protection", x = "Log 10SN50")

I have 3 questions after this step.

1. The x-axis labels in the resulting plot overlap. Can anyone show me how to reduce the number of decimals on the x-axis to 2?

2. I need to show the percentages inside the bars.

3. I need to show the number of observations (count) for each 'O' class on top of each bar. I have read "How to center stacked percent barchart labels", want to create a plot like the one in the answer by [eipi10][1], and tried this code:

    df.summary = QCVNTO %>% group_by(O) %>% +
          summarise(ProtectionStatus = count(ProtectionStatus)) %>%   
          mutate(percent = ProtectionStatus/sum(ProtectionStatus),
                 pos = cumsum(percent) - 0.5*percent)
    
    ggplot(df.summary,aes(x=QCVNTO$O,QCVNTO$ProtectionStatus,
              function(x)+sum(x)),y=percent,fill=Category) + 
       geom_bar(stat='identity',  width = .7, colour="black", lwd=0.1) +
       geom_text(aes(label=ifelse(percent >= 0.07, paste0(sprintf("%.0f",    
                 percent*100),"%"),""),y=pos), colour="white") +
      coord_flip() +  scale_y_continuous(labels = percent_format()) +                
      labs(y="", x="")
    

but it gives the error `Aesthetics must be either length 1 or the same as the data (2): x, y`.

I really thank you all for providing your valuable time in reading this question.

1 Answer

Given

p <- ggplot(df,aes(x = O, y = Freq, fill = ProtectionStatus)) +  
  geom_bar(position = "fill",stat = "identity") +  
  scale_y_continuous(labels = scales::percent, breaks = seq(0,1,by=0.1))+  
  labs(title = "Log 10SN50 Vs Percentage of Protection", y = "Percentage of Protection", x = "Log 10SN50")

you could do

library(scales)
p + geom_text(
    aes(y = Freq, label = ifelse(Freq<1&Freq>0, percent(Freq), NA)),
    data=transform(df, Freq=Freq/ave(Freq, O, FUN=sum)),
    position = position_stack(vjust = 0.5)) + 
  geom_text(aes(y=1, label = with(df, ave(Freq, O, FUN=sum))), vjust=-.5) +
  scale_x_discrete(labels = function(x) round(as.numeric(x), digits=2)) 
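
Most of the work above is done by `ave(Freq, O, FUN=sum)`, which returns, for every row, the total count of that row's `O` class; dividing `Freq` by it gives the within-bar share. A minimal base-R sketch of that step, using a made-up table (the `toy` data frame and its counts are hypothetical, not taken from the question's data):

```r
# Hypothetical toy table in the same shape as `df` from xtabs():
# one row per (ProtectionStatus, O) combination, with a count in Freq.
toy <- data.frame(
  ProtectionStatus = factor(c(0, 1, 0, 1, 0, 1)),
  O    = factor(c("0.903089986991944", "0.903089986991944",
                  "1.20411998265592",  "1.20411998265592",
                  "3.01029995663981",  "3.01029995663981")),
  Freq = c(12, 6, 1, 3, 0, 4)
)

# Total observations per O class, repeated on every row of that class.
toy$n_total <- ave(toy$Freq, toy$O, FUN = sum)

# Share of each ProtectionStatus within its O class.
toy$share <- toy$Freq / toy$n_total

# Label only segments strictly between 0% and 100%.
toy$label <- ifelse(toy$share > 0 & toy$share < 1,
                    paste0(round(100 * toy$share), "%"), NA)

# Shorten the axis labels the same way the scale_x_discrete() call does.
toy$O_short <- round(as.numeric(as.character(toy$O)), digits = 2)

print(toy[, c("O_short", "ProtectionStatus", "Freq", "n_total", "share", "label")])
```

In the plot, `n_total` corresponds to the count drawn above each bar and `label` to the text placed inside the segments; the last class has a 0%/100% split, so both of its segment labels come out as `NA`.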

lukeA
  • Thank you very much for the answer. It has almost solved my issue, but I don't want the graph to show 0% in 100% bars and 100% in 0% bars: in the higher 'O' classes ProtectionStatus=1 is 100%, yet the 0% label is still drawn and overlaps the number of observations shown for each 'O' class. Can you correct the code for this? Many thanks to @krish for the edit and @lukeA for the excellent and very quick reply. – R.P. Tamil Selvan Jan 07 '17 at 07:30
  • But in the 100% bars the 100% label is not visible. I want to display 100% in the higher 'O' classes, because the 0% category is not represented in the bar. Alternatively, showing percentages for only one of the categories (ProtectionStatus) would also be fine. The main objective is to show that ProtectionStatus decreases as the 'O' class increases, which I think is possible by reversing the fill order. – R.P. Tamil Selvan Jan 07 '17 at 10:35
  • Use `ifelse(Freq>0, percent(Freq), NA)`? – lukeA Jan 07 '17 at 13:03
  • Yes, it solved my issue; I will mark it as answered. As a sub-question, can you explain how to show the percentage for only one of the two categories in each x-axis class, for example (ProtectionStatus == 0), and how to reverse the order of the stacking? I tried `p = ggplot(df[order(df$ProtectionStatus, decreasing = T),], aes(x = O, y = Freq, fill = ProtectionStatus)) +`. Thank you again. – R.P. Tamil Selvan Jan 08 '17 at 07:48
  • #1 `ifelse(Freq<1&Freq>0 & ProtectionStatus==0, percent(Freq), NA)`. #2 `df$ProtectionStatus <- factor(df$ProtectionStatus, levels = rev(levels(df$ProtectionStatus)))`. – lukeA Jan 08 '17 at 16:54
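
The two suggestions in the last comment can be tried out in base R before plotting. The sketch below uses a made-up stand-in for `df` (the `toy` data frame and its values are hypothetical) whose `Freq` column is assumed to already hold within-class shares:

```r
# Hypothetical stand-in for `df`; ProtectionStatus is a factor, as xtabs() returns.
toy <- data.frame(
  ProtectionStatus = factor(c("0", "1", "0", "1")),
  O    = factor(c("0.9", "0.9", "1.2", "1.2")),
  Freq = c(0.75, 0.25, 0.5, 0.5)   # assumed to be within-class shares already
)

# 1: build labels for the ProtectionStatus == 0 segments only.
toy$label <- ifelse(toy$Freq > 0 & toy$Freq < 1 & toy$ProtectionStatus == "0",
                    paste0(round(100 * toy$Freq), "%"), NA)

# 2: reverse the factor levels; the row values themselves stay unchanged.
levels(toy$ProtectionStatus)   # "0" "1"
toy$ProtectionStatus <- factor(toy$ProtectionStatus,
                               levels = rev(levels(toy$ProtectionStatus)))
levels(toy$ProtectionStatus)   # "1" "0"

print(toy)
```

Reversing the levels is the relevant step because, as far as I know, ggplot2 orders stacked segments by the levels of the fill factor; this would also explain why sorting the rows with `order()` did not change the stacking.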