I need to create a new yes/no (or 1/0) variable based on a group of existing ICD columns: it should be "yes" when any of those columns holds one of a set of specific values. My current code is: `inclusion %>% filter_at(vars("col1", "col2", "col3"), any_vars(. %in% c(49100, 49122, 48911, 404)))`. However, this only filters rows; it does not generate the final yes/no variable. Any suggestions?
-
You should use the `dplyr` function `dplyr::mutate()`. With this function you can add/modify columns in your dataset. You cannot do this with `dplyr::filter_at()`. – van Nijnatten Mar 25 '21 at 16:33
-
Hi Jing. Can you add a reprex? This will increase the chances of getting a concrete answer (see: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Marcelo Avila Mar 25 '21 at 19:40
1 Answer
2
Instead of using `filter_at`, you could consider using `mutate(across(...))` along with an `ifelse`:
inclusion %>%
  mutate(across(c(col1, col2, col3),
                ~ifelse(.x %in% c(49100, 49122, 48911, 404), TRUE, FALSE)))
This would overwrite those columns. If you want new columns, add a `.names` argument as follows:
inclusion %>%
  mutate(across(c(col1, col2, col3),
                ~ifelse(.x %in% c(49100, 49122, 48911, 404), TRUE, FALSE),
                .names = "{.col}_in_vec"))
If you want a single output indicating whether any of the values appear in any of the three columns, use `c_across`:
inclusion %>%
  rowwise() %>%
  mutate(in_vec = any(c_across(c(col1, col2, col3)) %in% c(49100, 49122, 48911, 404)))
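As a point of comparison, the single-flag case can also be sketched without `rowwise()` by using `if_any()`. This assumes dplyr 1.0.4 or later; the toy `inclusion` data frame and the `codes` and `flagged` names below are made up for illustration and are not from the question.

```r
library(dplyr)

# Hypothetical toy data standing in for the asker's `inclusion` data frame
inclusion <- tibble(
  col1 = c(49100, 25000, 40100),
  col2 = c(27800, 404, 41401),
  col3 = c(4280, 5849, 2724)
)

# ICD codes of interest, copied from the question
codes <- c(49100, 49122, 48911, 404)

# if_any() is TRUE for a row when the condition holds in at least one
# of the selected columns; as.integer() turns TRUE/FALSE into 1/0
flagged <- inclusion %>%
  mutate(in_vec = as.integer(if_any(c(col1, col2, col3), ~ .x %in% codes)))

flagged$in_vec
```

Because the result is assigned to `flagged`, the new `in_vec` column is kept alongside the original columns, and no rows are dropped.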

Will Hipson
-
It should retain all observations. You can swap the `TRUE` and `FALSE` for `1` and `0` if you prefer, but `mutate` won't drop observations. – Will Hipson Mar 25 '21 at 16:50
-
All observations are retained, but how can I find the newly created variable? For example, if any of the three columns has a value of 49100, I need a new variable with a value of 1. – Lisa Mar 25 '21 at 17:39
-
You need to ensure that you are (1) assigning the output to an object using `<-` and (2) in `across` use the `.names` argument (as shown in the 2nd example) to create new columns with the desired output. – Will Hipson Mar 25 '21 at 17:44
-
I think I understand your problem now. Check the third solution and let me know if that works. – Will Hipson Mar 25 '21 at 17:48
-
I tried the exact code `inclusion %>% rowwise() %>% mutate(in_vec = any(c_across(c(col1, col2, col3)) %in% c(49100, 49122, 48911, 404)))` and this is the error message: Error: `c_across()` must only be used inside dplyr verbs. – Lisa Mar 25 '21 at 17:58
-
If I run it on `mtcars` it seems to work: `mtcars %>% rowwise() %>% mutate(in_vec = any(c_across(c(mpg, cyl, disp)) %in% c(21, 6, 160)))`. You could instead pipe the result from the 2nd solution into another `mutate`, which looks like `mutate(in_vec = ifelse(col1 | col2 | col3, 1, 0))`. – Will Hipson Mar 25 '21 at 18:01
-
I just noticed that if you're using the second solution, make sure you use the new variable names, so it would be `mutate(in_vec = ifelse(col1_in_vec | col2_in_vec | col3_in_vec, 1, 0))`. – Will Hipson Mar 25 '21 at 18:10
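For completeness, here is a minimal sketch of the two-step route discussed in these comments: create the per-column flags with `.names`, then combine them into one 1/0 variable. The toy data and the `codes` and `result` names are hypothetical. Note that `any()` called on whole columns outside `rowwise()` reduces to a single value for the entire data frame, so an element-wise `|` is used here to keep the result per row.

```r
library(dplyr)

# Hypothetical toy data standing in for the asker's `inclusion` data frame
inclusion <- tibble(
  col1 = c(49100, 25000, 40100),
  col2 = c(27800, 41401, 41401),
  col3 = c(4280, 404, 2724)
)

# ICD codes of interest, copied from the question
codes <- c(49100, 49122, 48911, 404)

result <- inclusion %>%
  # Step 1: one logical flag per column, named col1_in_vec, col2_in_vec, ...
  mutate(across(c(col1, col2, col3), ~ .x %in% codes, .names = "{.col}_in_vec")) %>%
  # Step 2: combine the flags element-wise so each row gets its own 1/0 value
  mutate(in_vec = ifelse(col1_in_vec | col2_in_vec | col3_in_vec, 1, 0))

result$in_vec
```

The output has to be assigned (here to `result`) for the new columns to be available afterwards.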