
I'm trying to make a bar graph of ten categories, but when I run my code I get a strange graph in which every bar shows a frequency of 1.00. I don't want frequencies computed for me; I want to plot the counts that are already in my data frame. Here is my code so far.

library(dplyr)
library(tidyverse)

path <- file.path("~", "Desktop", "Police_Use_of_Force.csv")
invisible(Force <- read.csv(path, stringsAsFactors = FALSE))
invisible(ProblemDf <- Force %>%
              select(Problem))
ProblemDf[ProblemDf==""] <- NA
hi <- tibble(ProblemDf[rowSums(is.na(ProblemDf)) != ncol(ProblemDf), ])
names(hi) = "Problem"
topTen <- hi %>%
    count(Problem) %>%
    arrange(desc(n)) %>%
    top_n(10, n)
ggplot(topTen, aes(y = Problem)) + geom_bar()

And here is the graph that it produces: [image: Bar Graph]

Buchlord
  • Try `geom_bar(stat='identity')` – Duck Jul 26 '20 at 22:33
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. What does `topTen` look like? Does it have a column for count? – MrFlick Jul 26 '20 at 22:45
  • I tried `ggplot(topTen, aes(y = Problem)) + geom_bar(stat = 'identity')` but it throws an error that says: "Error: geom_bar requires the following missing aesthetics: x" – Buchlord Jul 26 '20 at 22:45
  • `topTen` is a tbl_df with the categories in one column and the "n" counts in another, where the n column was created when I called `count(Problem)` – Buchlord Jul 26 '20 at 22:48
  • `ggplot(topTen, aes(x=n, y = Problem))` – Edward Jul 26 '20 at 22:53
  • That worked! Thank you so much!! – Buchlord Jul 27 '20 at 02:54

1 Answer


geom_bar() essentially makes a univariate plot: by default it counts the number of times each value appears for you. For example:

ggplot(data.frame(vals=c("a","a","a","z","z")), aes(y=vals)) + geom_bar()
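If it helps to see the counting happen, here is a small sketch that inspects what geom_bar() computes for that same toy data. It relies on ggplot2's layer_data() helper; the names `df`, `p`, and `computed` are just placeholders chosen for this illustration.

```r
library(ggplot2)

# Same toy data as in the example above
df <- data.frame(vals = c("a", "a", "a", "z", "z"))
p <- ggplot(df, aes(y = vals)) + geom_bar()

# layer_data() returns the data frame that ggplot2 actually draws from;
# its `count` column holds the tallies that stat_count() computed
computed <- layer_data(p)
computed$count
```

This also suggests why the original plot shows 1 for every bar: topTen has exactly one row per Problem, so counting rows gives 1 each time.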

In your case, however, you have already calculated the counts, so you are really making a bivariate plot. The correct geom for that is geom_col(), and you need to tell ggplot which column contains the counts. Use:

ggplot(topTen, aes(y = Problem, x=n)) + geom_col()
ggplot(data.frame(vals=c("a","z"), n=c(3,2)), aes(y=vals, x=n)) + geom_col()
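As a rough end-to-end sketch, the snippet below goes from raw values to pre-computed counts and plots them two ways. The `raw` data frame is made up to stand in for the Problem column; it is not the real police data. It also shows that the geom_bar(stat = "identity") suggestion from the comments works once both x and y are mapped, which is what the "missing aesthetics: x" error was complaining about.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for the Problem column of the real data
raw <- data.frame(Problem = c("a", "a", "a", "z", "z"))

# count() collapses the data to one row per category, with the tally in `n`
counted <- raw %>% count(Problem)

# geom_col() plots the values in `n` as they are
p_col <- ggplot(counted, aes(x = n, y = Problem)) + geom_col()

# Same idea with geom_bar(), provided both x and y are mapped
p_bar <- ggplot(counted, aes(x = n, y = Problem)) + geom_bar(stat = "identity")
```

Both plots should draw bars of the same length; geom_col() is simply the shorter way to say "use the values as given".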
MrFlick