I am trying to use a simple dplyr command to group World Bank data and then add up all of the country populations for a given year in a new column. No matter what I do, I get nothing but a column of NAs. I've even tried base R's aggregate(), and that doesn't work either. I'm not sure what is going on: I converted all of the World Bank data to numeric, but the sums still come out as NA.

The code I'm using is very similar to the one in "How to create a new variable that is the sum of a column, by group, in R?":

df %>%
  group_by(group) %>%
  mutate(sum = sum(variable)) %>%
  ungroup()

Thank you!

Edit: also, the code works on any other dataset, so I'm not sure what I did wrong.

  • Do you have NAs in the variable column? Trying sum(variable, na.rm=TRUE) will probably get you over one hurdle, but being aware of where those NAs come from is usually important – Dubukay Jan 13 '22 at 23:25

1 Answer

Do you have NAs in your variable column?

If so, try this to get the sum ignoring missing values:

df %>%
  group_by(group) %>%
  mutate(sum = sum(variable, na.rm=TRUE)) %>%
  ungroup()
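
To see why a single missing value turns a whole group's sum into NA, and to check where the NAs sit, here is a minimal sketch. The toy data, the column names (year, country, population) and the ".." placeholder for missing values are assumptions made for illustration, not taken from the question's actual dataset.

```r
library(dplyr)

# Hypothetical toy data standing in for the World Bank table;
# column names and values are made up for illustration.
df <- tibble(
  year       = c(2000, 2000, 2000, 2001, 2001),
  country    = c("A", "B", "C", "A", "B"),
  population = c("100", "200", "..", "110", "210")  # ".." is an assumed missing-value marker
)

# Converting text to numeric turns anything unparseable into NA
# (normally with a warning, suppressed here).
df <- df %>%
  mutate(population = suppressWarnings(as.numeric(population)))

# Count the missing values in each group to see where the NAs come from.
na_counts <- df %>%
  group_by(year) %>%
  summarise(n_missing = sum(is.na(population)))

# Compare the plain sum with the sum that ignores missing values.
result <- df %>%
  group_by(year) %>%
  mutate(
    total_strict = sum(population),               # NA if any value in the group is NA
    total_na_rm  = sum(population, na.rm = TRUE)  # drops NAs before summing
  ) %>%
  ungroup()
```

If every group contains at least one NA, the plain sum is NA for every row, which would match the all-NA column described in the question. Note that with na.rm = TRUE a group whose values are all NA sums to 0, so it is worth checking the per-group NA counts before trusting the totals.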
rw2