
I have tried everything to convert the bar chart I made here from COUNT on the y-axis to PERCENT OF TOTAL (N=142) on the y-axis, but can't seem to figure it out. I would like the x-axis to be the columns "Spatial_Management", "Landing_ban", and "Bycatch_Retention", the y-axis to be the percentage of policies that have a value of 1 in that column, and the fill to be "Strength". I think I need to make a very simple edit to my data beforehand. I have tried the code below, but it's not working.

EDIT: sample dataframe:

    df<- data.frame(policy=c("Policy A", "Policy B", "Policy C", "Policy D", 
                     "Policy E","Policy F" ),
            Spatial_Management= c(0,1,1,0, 0,1),
            Landing_ban= c(0,1,1,0, 0,1),
            Bycatch_Retention= c(0,1,1,0, 0,1),
            Strength=c("M", "V", "M", "P", "P", "M"),
            stringsAsFactors=FALSE)

My current figure code is:

    df %>% 
      pivot_longer(Spatial_management:Bycatch_Retention) 
      filter(value==1) %>%
      ggplot(aes(x=factor(name, level=level_order), fill = factor(Strength)) +
                           y = (..count..)/sum(..count..)) +
     geom_bar()+
     stat_bin(geom = "text",
           aes(label = paste(round((..count..)/sum(..count..)*100), "%")),
           vjust = 5) +
     scale_y_continuous(labels = percent)

I know this is very simple, but would appreciate any help!!!

Alyssa C
  • I would experiment with summarizing the data beforehand and then using `geom_col`, as shown in the example below. To me that's a lot easier to control and understand than the `..count..` calls within the ggplot. – cardinal40 Feb 24 '20 at 19:01

1 Answer


Here, you need to reshape your data frame into a longer format and then, for each column and each `Strength`, divide the number of 1 values by the number of policies (here, that is the number of rows of your data frame):

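    library(tidyr)
    library(dplyr)
    library(ggplot2)
    library(scales)  # provides percent() for the axis labels

    df %>%
      # one row per policy/column pair
      pivot_longer(-c(policy, Strength), names_to = "var", values_to = "val") %>%
      group_by(Strength, var) %>%
      # share of all policies that have a 1 in this column, per Strength
      summarise(Val = sum(val) / nrow(df)) %>%
      ggplot(aes(x = var, y = Val, fill = Strength)) +
      geom_col() +
      scale_y_continuous(labels = percent)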
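As a quick check of what the plot is built from, here is a minimal sketch that runs the same reshape and summary on the sample data frame from the question and prints the result instead of plotting it. The object name `shares` is only for illustration, and `.groups = "drop"` assumes dplyr 1.0.0 or later (it just silences the regrouping message).

```r
library(tidyr)
library(dplyr)

# Sample data frame copied from the question
df <- data.frame(
  policy = c("Policy A", "Policy B", "Policy C", "Policy D", "Policy E", "Policy F"),
  Spatial_Management = c(0, 1, 1, 0, 0, 1),
  Landing_ban = c(0, 1, 1, 0, 0, 1),
  Bycatch_Retention = c(0, 1, 1, 0, 0, 1),
  Strength = c("M", "V", "M", "P", "P", "M"),
  stringsAsFactors = FALSE
)

# `shares` is a hypothetical name for the summarised table
shares <- df %>%
  pivot_longer(-c(policy, Strength), names_to = "var", values_to = "val") %>%
  group_by(Strength, var) %>%
  # number of 1s in each Strength/column cell, divided by all policies
  summarise(Val = sum(val) / nrow(df), .groups = "drop")

print(shares)
```

With this sample, every column has two "M" policies, one "V" policy, and no "P" policies with a value of 1, so each stacked bar is made of 2/6, 1/6, and 0, and reaches 50% in total.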


dc37
  • Yes! In this case, though, what is meant by ID? This is meant to be 5 columns; do I make a new object called "ID" that contains those columns? – Alyssa C Feb 24 '20 at 19:04
  • So you want ID on the x-axis, and y will be the sum of each column? What about `Strength`? Can you clarify what kind of bar graph you are trying to achieve? – dc37 Feb 24 '20 at 19:05
  • Yes, as you did above, I would like those columns ("Bycatch_Retention", etc.) on the x-axis. I made edits and included a clearer sample dataset above. When I try your code, I receive "Error in -x : invalid argument to unary operator". I think this is because I need to edit the data itself? – Alyssa C Feb 24 '20 at 20:00
  • I edited my answer accordingly. Let me know if it works for you. – dc37 Feb 24 '20 at 20:02
  • YES! It works! Wow, thank you! That was a long time coming. I'm now trying to order them descending, which I think I can figure out via level ordering, but if you have ideas I'd love to hear them. THANK YOU! :) – Alyssa C Feb 24 '20 at 20:34
  • I think you need to order based on the value first and then fix the factor levels in that order. There are a lot of posts here that address this issue. – dc37 Feb 24 '20 at 20:38
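
The last suggestion (order by value, then fix the factor levels in that order) could look roughly like the sketch below. It is one possible approach, not necessarily what dc37 had in mind. The `shares` table and its numbers are invented for illustration, because in the sample data from the question all three columns are identical and would tie.

```r
library(dplyr)
library(ggplot2)

# Hypothetical summarised shares (values invented for illustration only)
shares <- data.frame(
  var = rep(c("Spatial_Management", "Landing_ban", "Bycatch_Retention"), each = 2),
  Strength = rep(c("M", "V"), times = 3),
  Val = c(0.10, 0.05, 0.30, 0.20, 0.20, 0.10),
  stringsAsFactors = FALSE
)

# Total bar height per column, sorted from largest to smallest
level_order <- shares %>%
  group_by(var) %>%
  summarise(total = sum(Val), .groups = "drop") %>%
  arrange(desc(total)) %>%
  pull(var)

# Fix the factor levels in that order so ggplot draws the bars descending
shares$var <- factor(shares$var, levels = level_order)

p <- ggplot(shares, aes(x = var, y = Val, fill = Strength)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent)

print(p)
```

Computing the order from the per-column totals (rather than from the individual `Strength` rows) keeps the stacked bars sorted by their full height.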