
I have an Excel file with data about the genders of different people. I read it with read.csv and use ggplot to draw a bar plot of that data.

This is the data:

> dput(dat.absolventen$Geschlecht)
structure(c(1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("männlich", 
"weiblich"), class = "factor")

And this is my code:

ggplot(data = d, aes(x=Geschlecht,y=(..count..)/sum(..count..))) +
  geom_bar(
            fill="steelblue"
           ) +
  geom_text(aes(label = sprintf("%0.1f%%",(..count..)/sum(..count..)*100)), 
            stat = "count", 
            colour = "white",
            vjust = +2,
            fontface = "bold"               
            )

This gives me a good enough graph. There are 9 females and 32 males. I can get it to show the percentage inside each bar. However, I would like to show the percentage of males inside the bar and the percentage of females outside, on top of the bar (basically, if the bar is too short, move the label outside).

I know I can use ifelse(), but I can't figure out how to apply it to each bar's value, that is, the total count of males and of females. If I use ifelse() when defining aes(label = ...), the criterion is applied to the whole column. I want it to test each bar, that is, male and female, and then set vjust depending on whether the count is less than or greater than, say, 15.

I have tried using

ifelse(..count.. >15, -2, +2)

but this gives me the error that '..count.. not found'. And I am not sure why it says that because while defining aesthetics I am using ..count.. and it works there.

There have been many similar questions asked before but I have been unable to get any help from them, which is why I have to ask again for this particular case. Regards.

detraveller
  • can you add some data so others may reproduce your plot – Nate Jan 06 '19 at 16:19
  • Could you make your problem reproducible by sharing a sample of your data so others can help (please do not use `str()`, `head()` or screenshot)? You can use the [`reprex`](https://reprex.tidyverse.org/articles/articles/magic-reprex.html) and [`datapasta`](https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html) packages to assist you with that. See also [Help me Help you](https://speakerdeck.com/jennybc/reprex-help-me-help-you?slide=5) & [How to make a great R reproducible example?](https://stackoverflow.com/q/5963269) – Tung Jan 06 '19 at 16:24
  • @Tung I am not sure how I can post the data here. It is a single column excel sheet with Column heading Geschlecht and 40 entries below it, 30 male and 10 female. – detraveller Jan 06 '19 at 16:38
  • Your dataframe is called d. In R, run `dput(d)` and paste the result in your question. – Joseph Clark McIntyre Jan 06 '19 at 16:46
  • @JosephClarkMcIntyre done – detraveller Jan 06 '19 at 16:55

1 Answer


The issue, I think, is that ..count.. is only available inside aes(). You can't pass it to vjust as a plain argument because it isn't defined there. Here is a hacky solution: I work out the adjustment outside of ggplot from a table of the grouping factor, call it adj, and pass that to ggplot.

vec <- structure(c(1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 
            1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
            1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("männlich", 
                                                                        "weiblich"), class = "factor")

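library(ggplot2)

d <- data.frame(gender = vec)

# Count rows per gender (männlich = 32, weiblich = 9), then pick one vjust per bar:
# 2 keeps the label inside a tall bar, -2 lifts it above a short one.
nums <- table(d$gender)
adj <- ifelse(as.vector(nums) > 15, 2, -2)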

ggplot(data = d, aes(x=gender,y=(..count..)/sum(..count..))) +
  geom_bar(
    fill="steelblue"
  ) +
  geom_text(aes(label = sprintf("%0.1f%%",(..count..)/sum(..count..)*100)), 
            stat = "count", 
            colour = "black",
            vjust = adj,
            fontface = "bold"               
  )
Joseph Clark McIntyre
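
A possible alternative, sketched under the assumption that ggplot2 3.3.0 or later is installed: `vjust` is itself an aesthetic of `geom_text()`, so it can be mapped inside `aes()`, where the computed count is visible. `after_stat(count)` is the newer spelling of `..count..`. The threshold of 15 and the offsets 2 and -0.5 are illustrative values, and the data frame below is a stand-in with the same 32/9 split as the question's data.

```r
library(ggplot2)

# Stand-in for the question's data: 32 "männlich", 9 "weiblich" (assumed).
d <- data.frame(gender = factor(rep(c("männlich", "weiblich"), times = c(32, 9))))

p <- ggplot(d, aes(x = gender)) +
  geom_bar(aes(y = after_stat(count / sum(count))), fill = "steelblue") +
  geom_text(
    aes(
      y = after_stat(count / sum(count)),
      label = after_stat(sprintf("%0.1f%%", count / sum(count) * 100)),
      # Evaluated once per bar: tall bars get the label inside, short bars above.
      vjust = after_stat(ifelse(count > 15, 2, -0.5))
    ),
    stat = "count",
    fontface = "bold"
  )

print(p)
```

Because the test runs on the per-bar count, this should carry over to data with more than two categories without building a separate `adj` vector.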
  • Thanks. I had been trying to do that already but couldn't get it to work for individual bars. It works now. I will now try to extend this to a general file where I can put the percentage out of the bar if the bar is too small. I can probably use a criteria where the count of the values is below a particular percentage of the total data. Any tips regarding that are welcome. – detraveller Jan 06 '19 at 17:13
  • Use `prop.table(table(vec))` to get a table of proportions. – Joseph Clark McIntyre Jan 06 '19 at 17:23
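
The `prop.table()` suggestion could be combined with the answer's `adj` vector roughly as follows. This is only a sketch: the 25% cutoff is an arbitrary assumption, and `vec` is rebuilt here with the same 32/9 split as the question's data.

```r
# Stand-in for the question's factor: 32 "männlich", 9 "weiblich" (assumed).
vec <- factor(rep(c("männlich", "weiblich"), times = c(32, 9)))

# Share of each level in the whole column (the shares sum to 1).
props <- prop.table(table(vec))

# Bars below the cutoff get a negative vjust (label above the bar),
# the rest a positive one (label inside). The 0.25 cutoff is an assumption.
cutoff <- 0.25
adj <- ifelse(as.vector(props) < cutoff, -2, 2)

print(round(props, 3))
print(adj)
```

The resulting `adj` can then be passed to `vjust` in `geom_text()` in the same way as in the answer.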