I wish to write a loop that works through the year columns one at a time and clears each value (sets it to 0) wherever the value in `Start Year` is greater than the year the column is named after.

X <- X %>%
  mutate(`2017` = ifelse(as.numeric(`Start Year`) > 2017, 0, `2017`)) %>%
  mutate(`2018` = ifelse(as.numeric(`Start Year`) > 2018, 0, `2018`)) 

I need to repeat this for multiple years but am unsure how to reference the columns named 2017, 2018 etc. in a loop. Thanks in advance.

j_abrams
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. There may be better ways than using a bunch of `mutate()` statements. – MrFlick Mar 04 '20 at 15:55
  • You could use filter directly here: ``X %>% filter_at(years_vector_variable, ~as.numeric(`Start Year`) > .)`` – cbo Mar 04 '20 at 15:56

1 Answer

I would suggest doing it the tidy way. First, make your df tidy by gathering all the year columns, like so:

library(dplyr)
library(tidyr)

# Example data: two ordinary columns plus one column per year
df <- dplyr::tibble(
  x = runif(10),
  y = runif(10),
  "2017" = runif(10),
  "2018" = runif(10),
  "2019" = runif(10)
)

# convert = TRUE turns the year key from character into integer
df <- df %>% 
  gather(year, value, "2017":"2019", convert = TRUE)
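
In newer versions of tidyr, `gather()` is superseded by `pivot_longer()`. The reshaping step could be sketched as below; the `wide` and `long` names and the small example data are only illustrative, not taken from the question.

```r
library(dplyr)
library(tidyr)

# Illustrative data (hypothetical): one ordinary column plus one column per year
wide <- tibble(
  x = c(0.1, 0.2, 0.3),
  `2017` = c(1, 2, 3),
  `2018` = c(4, 5, 6),
  `2019` = c(7, 8, 9)
)

# One row per (original row, year); the year arrives as text, so convert it
long <- wide %>%
  pivot_longer(
    cols = `2017`:`2019`,
    names_to = "year",
    values_to = "value"
  ) %>%
  mutate(year = as.integer(year))
```

Converting `year` to a number keeps later comparisons such as `year >= start_year` numeric rather than string-based.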

Then filter on `start_year`, keeping only the rows from that year onwards:

start_year <- 2018

df %>% 
  filter(year >= start_year)

or replace values for all other years with zeros as your code suggests:

df %>% 
  mutate(value = ifelse(year >= start_year, value, 0))
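
If you would rather keep the wide layout and use a per-row `Start Year` column as in the question, something along these lines might work. It is only a sketch: it assumes dplyr 1.0.0 or later (for `across()` and `cur_column()`), and the example data and the `year_cols` vector are made up for illustration.

```r
library(dplyr)

# Illustrative data (hypothetical): Start Year stored as text, as the
# as.numeric() call in the question suggests
X <- tibble(
  `Start Year` = c("2017", "2018", "2019"),
  `2017` = c(1, 2, 3),
  `2018` = c(4, 5, 6),
  `2019` = c(7, 8, 9)
)

# Names of the year columns to process (assumed; adjust to your data)
year_cols <- c("2017", "2018", "2019")

# For each year column, zero the value where Start Year is later than
# the year the column is named after; cur_column() gives that name
X_zeroed <- X %>%
  mutate(across(
    all_of(year_cols),
    ~ ifelse(as.numeric(`Start Year`) > as.numeric(cur_column()), 0, .x)
  ))
```

This applies the same rule as the chained `mutate()` calls in the question, without writing one line per year.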
stefan