
I have a dataset that includes individuals' ages and the amount each spent on groceries. I need to group those individuals by age in 10-year bands, find the average amount spent on groceries in each band, and draw a bar graph showing the average amount spent by age range.

I have tried watching some videos on YouTube and reading some similar questions, but I can't find one that really solves my problem.

MrFlick
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Nov 04 '22 at 15:18
  • I second MrFlick's advice. I also suggest you explore the `dplyr` package of the tidyverse: https://dplyr.tidyverse.org/ Here are links to more compact presentations of the `dplyr` functions, including the very powerful pipe operator: https://courses.cs.ut.ee/MTAT.03.183/2017_fall/uploads/Main/dplyr.html https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html I'm guessing that you will need to add an auxiliary column that contains the decade for each age. The `cut` function can help with that: https://r-coder.com/cut-r/ `cut` can be called inside the `dplyr` `mutate` function. – SteveM Nov 04 '22 at 17:42
  • Greetings! It is usually helpful to provide a minimal reproducible dataset for questions here so people can troubleshoot your problem (rather than a table or screenshot, for example). One way of doing this is to call the `dput` function on the data, or on a subset of the data you are using, and then paste the output into your question. You can find out how to use it here: https://youtu.be/3EID3P1oisg – Shawn Hemelstrand Nov 17 '22 at 01:18
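
As a rough illustration of the `cut` inside `mutate` idea from the comments above, here is a minimal sketch. The data frame `dataset`, its values, and the column names `age` and `amount_spent` are made up for the example (the question does not show the real data), and the 0 to 100 range of the breaks is also an assumption.

```r
library(dplyr)
library(ggplot2)

# Hypothetical sample data; replace with the real dataset.
dataset <- data.frame(
  age = c(5, 18, 23, 27, 34, 39, 41, 58),
  amount_spent = c(20, 35, 50, 70, 80, 100, 90, 60)
)

means <- dataset %>%
  mutate(age_group = cut(
    age,
    breaks = seq(0, 100, by = 10),  # bin edges 0, 10, ..., 100 (assumed range)
    right = FALSE,                  # bins are [0,10), [10,20), ...
    include.lowest = TRUE           # lets age 100 fall into the last bin
  )) %>%
  group_by(age_group) %>%
  summarise(average_amount = mean(amount_spent), .groups = "drop")

# One bar per age range, with bar height equal to the average amount spent.
ggplot(means, aes(x = age_group, y = average_amount)) +
  geom_col() +
  labs(x = "Age range", y = "Average amount spent")
```

With `cut`, the bins come from a single `breaks` vector, so widening the age range only means changing the `seq()` call rather than adding one condition per decade.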

1 Answer


Something like this should work, assuming your columns are named 'age' and 'amount_spent':

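library(dplyr)
library(ggplot2)  # needed so that aes() and geom_bar() are found

# Assign each row to a 10-year age band, then average spending per band
means <- dataset %>%
  mutate(age_in_10s = case_when(
    between(age, 0, 10) ~ "0-10",
    between(age, 11, 20) ~ "11-20",
    between(age, 21, 30) ~ "21-30",
    TRUE ~ "other"  # anything not matched above; add more bands as needed
  )) %>%
  group_by(age_in_10s) %>%
  summarise(average_amount = mean(amount_spent)) %>%
  ungroup()

# Bar chart of the average amount spent per age band
means %>%
  ggplot(aes(x = age_in_10s, y = average_amount)) +
  geom_bar(stat = "identity")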
lilblue
  • I tried it and it gives me an error: Error in `mutate()`: ! Problem while computing `age_in_10s = case_when(...)`. Caused by error in `between()`: ! argument "right" is missing, with no default. Run `rlang::last_error()` to see where the error occurred. – liorse213 Nov 04 '22 at 20:41
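
One possible cause of that error, offered as a guess since the commenter's exact code is not shown: `dplyr::between()` takes three arguments (the values, a lower bound, and an upper bound), so a call that leaves out the upper bound fails. The vector `ages` below is made up for the illustration.

```r
library(dplyr)

# Hypothetical ages, just for the illustration.
ages <- c(5, 15, 25)

# Three arguments: the values, the lower bound, and the upper bound
# (both bounds are inclusive).
in_band <- between(ages, 11, 20)
print(in_band)  # FALSE TRUE FALSE

# Leaving out the upper bound raises an error; tryCatch() captures the
# message instead of stopping the script.
msg <- tryCatch(
  between(ages, 11),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If your own call looks like the second one, for example because a comma was dropped or a bound was typed as `0-10`, adding the missing upper bound should clear the error.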