Create dummy variable based on conditions in multiple other columns

Question

I am looking for help in adding a dummy variable to an existing dataframe based on conditions in multiple columns (this last bit is what separates my question from the answers I already found).

Here's a simple example:

y <- c(1,2,5,2,3,3)
z <- c("A", "B", "B", "A", "A", "B")
df <- data.frame(y = y, z = z)

Now I'd like to have a third column, which takes the value '1' if y is equal to 2 or if z is equal to B. So the column would show a value of 1 for all observations except the first (A,1) and the fifth (A,3).

I'm sure I know all the ingredients for doing this, I just cannot put it together right now. Any help would be much appreciated!

Try `df$z <- with(df, +(y == 2|z == "B"))` – akrun Jul 19 '22 at 17:18
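The one-liner in this comment can be unpacked as a short base R sketch. It assumes the result should go into a new column, here called `dummy` (the name used in the answer below), rather than overwriting `z` as the comment's code would.

```r
y <- c(1, 2, 5, 2, 3, 3)
z <- c("A", "B", "B", "A", "A", "B")
df <- data.frame(y = y, z = z)

# y == 2 | z == "B" yields a logical vector, one element per row.
# The unary + coerces TRUE/FALSE to the integers 1/0.
df$dummy <- with(df, +(y == 2 | z == "B"))

df$dummy
#> [1] 0 1 1 1 0 1
```

No extra packages are needed for this version; `as.integer()` could replace the unary `+` if a more explicit conversion is preferred.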

score 2 · Answer 1 · answered Jul 19 '22 at 17:23

dplyr option using case_when:

y <- c(1,2,5,2,3,3)
z <- c("A", "B", "B", "A", "A", "B")
df <- data.frame(y = y, z = z)

library(dplyr)
df %>%
  mutate(dummy = case_when(y == 2 | z == "B" ~ 1,
                           TRUE ~ 0))
#>   y z dummy
#> 1 1 A     0
#> 2 2 B     1
#> 3 5 B     1
#> 4 2 A     1
#> 5 3 A     0
#> 6 3 B     1

Created on 2022-07-19 by the reprex package (v2.0.1)

answered Jul 19 '22 at 17:23

Quinten


I got it to work, just one follow-up question. Can I add a condition whereby the dummy also takes on NA if either y or z is NA? For my purpose, I could delete missing values prior to the operation, but I'd like to do it in a less brute-force way. – SpecialK201 Jul 20 '22 at 10:49
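One possible way to handle this follow-up, sketched in base R: check for missing values explicitly before evaluating the condition. The explicit check matters because `NA | TRUE` evaluates to `TRUE` in R, so the plain condition would still return 1 for a row where `y` is NA but `z` is "B". The data with NA values and the helper name `either_missing` are made up for illustration.

```r
# Hypothetical data: same shape as the example, with two missing values added.
y <- c(1, 2, NA, 2, 3, 3)
z <- c("A", "B", "B", NA, "A", "B")
df <- data.frame(y = y, z = z)

# TRUE for every row where at least one of the two columns is missing.
either_missing <- is.na(df$y) | is.na(df$z)

# NA where something is missing, otherwise the usual 1/0 dummy.
df$dummy <- ifelse(either_missing,
                   NA_integer_,
                   as.integer(df$y == 2 | df$z == "B"))

df$dummy
#> [1]  0  1 NA NA  0  1
```

With `case_when`, the same idea should work by putting the missing-value check as the first branch, since branches are evaluated in order and the first match wins.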
