
Here's my task:

Create the following graph:

  • Group the Pokemon data by type1 and is_legendary
  • Summarize the data with the mean of the attack
  • Create a bar chart with type1 on the x, the mean of the attack on the y, and type1 as the fill
  • Change the colors of type1 to be type_color
  • facet_wrap it so that there are different bar charts for the regular and legendary pokemon

I am able to summarize the data by the mean of the attack, but it is not a column in my dataset. How do I make it so the mean of the attack is the y aesthetic? Am I overlooking a simpler way?

I have tried making a new column on the original Pokemon dataset, but that would create a mean of all of the data and not group it by type1 or is_legendary.

I also keep getting the error: Error: stat_count() must not be used with a y aesthetic. But when I look up that error, I can't see how it applies to this particular issue.


pokemon$type1 <- factor(pokemon$type1)
pokemon$is_legendary <- factor(pokemon$is_legendary)

pokemon %>% 
  group_by(type1, is_legendary) %>%  
  summarize(mean_attack = mean(attack)) %>% 
  ggplot(mapping = aes(x = type1, y = mean_attack, fill = type1)) +
  geom_bar() +
  scale_fill_manual(values = type_color) +
  facet_wrap(~ is_legendary) +
  labs(title = "Average Attack of Legendary and Regular Pokemon") +
  pokemon.theme
dc37
user12554068
  • One minor note, if you are piping summarized data into `ggplot()` you probably don't want to also refer to the original `pokemon` data frame as your data source. – Jon Spring Dec 18 '19 at 23:34
  • Use `geom_col` if you want to give a y-coordinate. `geom_bar` is for when you want ggplot to aggregate counts for you – camille Dec 18 '19 at 23:59
  • Here's [one](https://stackoverflow.com/q/39679057/5325862) of several SO posts you'll get by searching your error message. Beyond that, you can make this a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – camille Dec 19 '19 at 00:06

1 Answer


As explained in the documentation for geom_bar (https://ggplot2.tidyverse.org/reference/geom_bar.html):

geom_bar() uses stat_count() by default: it counts the number of cases at each x position. geom_col() uses stat_identity(): it leaves the data as is.
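
To make that difference concrete, here is a minimal sketch using the built-in iris data. The object names (p_count, species_means, p_mean) are only illustrative. With geom_bar() only x is mapped and ggplot2 does the counting; with geom_col() the y values have to be computed beforehand.

```r
library(ggplot2)

# geom_bar(): only x is mapped, so stat_count() counts the rows per Species.
p_count <- ggplot(iris, aes(x = Species)) +
  geom_bar()

# geom_col(): y is supplied explicitly, so the data is summarised first.
species_means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
p_mean <- ggplot(species_means, aes(x = Species, y = Sepal.Length)) +
  geom_col()
```

Printing p_count draws bars whose height is the number of rows per species, while p_mean draws bars whose height is the precomputed mean.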

Here is an example of what you are trying to get (using the iris dataset):

library(tidyverse)
iris %>% group_by(Species) %>% summarise(MeanSep = mean(Sepal.Length))

# A tibble: 3 x 2
  Species    MeanSep
  <fct>        <dbl>
1 setosa        5.01
2 versicolor    5.94
3 virginica     6.59

If you try to plot it using geom_bar, you get:

iris %>% group_by(Species) %>% summarise(MeanSep = mean(Sepal.Length)) %>% 
  ggplot(., aes(x = Species, y = MeanSep, fill = Species)) +
  geom_bar()

Error: stat_count() must not be used with a y aesthetic.

But if you use geom_col, as mentioned by camille, or `geom_bar(stat = "identity")`, you get your plot:

iris %>% group_by(Species) %>% summarise(MeanSep = mean(Sepal.Length)) %>% 
  ggplot(., aes(x = Species, y = MeanSep, fill = Species)) +
  geom_col()

iris %>% group_by(Species) %>% summarise(MeanSep = mean(Sepal.Length)) %>% 
  ggplot(., aes(x = Species, y = MeanSep, fill = Species)) + 
  geom_bar(stat = "identity")
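
As for whether there is a simpler way: one option not covered above is to skip the summarise step and let ggplot2 compute the mean itself with stat_summary(). This is only a sketch, and it assumes ggplot2 3.3.0 or later, where the argument is called fun (older versions used fun.y). The name p_summary is illustrative.

```r
library(ggplot2)

# The raw data goes straight into ggplot(); stat_summary() computes the
# mean of y for each x value and draws the result as bars.
p_summary <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  stat_summary(fun = mean, geom = "bar")
```

Summarising first and using geom_col() keeps the computed values available as a data frame, which is often easier to check.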

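
The same pattern carries over to two grouping variables with a facet, which is the shape of the plot in the question. Since the pokemon data is not available here, the sketch below uses mtcars as a stand-in: cyl plays the role of type1 and am the role of is_legendary. The colour vector cyl_color is a hypothetical analogue of type_color, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(ggplot2)

# Stand-in data: cyl ~ type1, am ~ is_legendary (both converted to factors).
cars <- mtcars %>%
  mutate(cyl = factor(cyl),
         am  = factor(am, labels = c("automatic", "manual")))

# Hypothetical named colour vector, analogous to type_color in the question.
cyl_color <- c("4" = "steelblue", "6" = "darkorange", "8" = "firebrick")

# One row per cyl/am combination, holding the mean horsepower.
mean_hp <- cars %>%
  group_by(cyl, am) %>%
  summarise(mean_hp = mean(hp), .groups = "drop")

# geom_col() uses the precomputed means; facet_wrap() splits by am.
p <- ggplot(mean_hp, aes(x = cyl, y = mean_hp, fill = cyl)) +
  geom_col() +
  scale_fill_manual(values = cyl_color) +
  facet_wrap(~ am) +
  labs(title = "Average horsepower by cylinders and transmission")
```

Note that each `+` sits at the end of a line. A line that starts with `+` is parsed by R as a new expression, so the layers after it would not be added to the plot.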

Hope this answers your question.

dc37