I have a df with answers to survey questions, where df$Q57 is one of five answers:

  1. "" (<- blank is basically NA)
  2. I would never do this
  3. I will do this in five years
  4. I will do this in 10 years
  5. I will do this eventually

I want to create a dummy variable where:

  1. "" = NA
  2. I would never do this = 0
  3. I will do this in five years = 1
  4. I will do this in 10 years = 1
  5. I will do this eventually = 1

The best way I know how to do this is with a series of ifelse commands:

df$Q57_dummy <- ifelse(df$Q57 == "I would never do this", 0, 1)
df$Q57_dummy <- ifelse(df$Q57 == "", NA, df$Q57_dummy)
table(df$Q57_dummy, useNA = "always")

This works, but I feel like there are cleaner ways to do it. Does anyone have suggestions? I will also have to recode survey answers that have more outcomes than just 1, 0, and NA. Thanks!

tchoup
  • You might want to consider using named vectors as lookup tables. There are quite a few examples on SO. The data for the named vector could be stored neatly somewhere (i.e., not in your R source code) – SmokeyShakers Oct 25 '21 at 14:33
  • Does this answer your question? [case\_when in mutate pipe](https://stackoverflow.com/questions/38649533/case-when-in-mutate-pipe) – KoenV Oct 25 '21 at 14:38
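
The lookup-table idea from the first comment could look roughly like the sketch below. The sample data frame and the name `q57_codes` are made up for illustration, and the answer strings are the placeholder ones from the question rather than the real survey wording.

```r
# Hypothetical sample data standing in for the real survey responses
df <- data.frame(
  Q57 = c("", "I would never do this", "I will do this in five years",
          "I will do this in 10 years", "I will do this eventually"),
  stringsAsFactors = FALSE
)

# Named vector used as a lookup table: names are answers, values are codes
q57_codes <- c(
  "I would never do this"        = 0,
  "I will do this in five years" = 1,
  "I will do this in 10 years"   = 1,
  "I will do this eventually"    = 1
)

# Index the lookup by answer text; as.character() guards against factors.
# Answers with no entry in the lookup (including "") come back as NA.
df$Q57_dummy <- unname(q57_codes[as.character(df$Q57)])

table(df$Q57_dummy, useNA = "always")
```

Because the mapping is just data, the same pattern should extend to questions with more than two codes by adding entries to the vector, or by reading the vector from a file instead of hard-coding it.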

1 Answer

tidyverse approach:

library(dplyr)

df <- df %>%
    mutate(Q57_dummy = case_when(
        Q57 == "" ~ NA,                          # blank answers become missing
        Q57 == "I would never do this" ~ FALSE,
        TRUE ~ TRUE                              # this is the else condition
    ))

You could handle the else condition in a few different ways, depending on your preferred code style. The above works, but you could also match the remaining answers with stringr:

str_detect(Q57, "I will do this") ~ TRUE

or manually input the options:

Q57 %in% c("I will do this in five years",...) ~ TRUE
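
To show how the choice of else condition changes the result, here is a minimal sketch that puts a catch-all version next to a `str_detect` version. It assumes numeric 0/1 codes (as in the question) instead of `TRUE`/`FALSE`; the sample data, the deliberately misspelled last row, and the column names `dummy_else` and `dummy_explicit` are hypothetical.

```r
library(dplyr)
library(stringr)

# Hypothetical sample data; the last row is a misspelled answer
df <- data.frame(
  Q57 = c("", "I would never do this", "I will do this in five years",
          "I will do this in 10 years", "I will do this eventually",
          "I wil do this eventualy"),
  stringsAsFactors = FALSE
)

df <- df %>%
  mutate(
    # Catch-all else: anything that is not blank or "never" is coded 1
    dummy_else = case_when(
      Q57 == "" ~ NA_real_,
      Q57 == "I would never do this" ~ 0,
      TRUE ~ 1
    ),
    # Explicit match, no else: unmatched values (including "") become NA
    dummy_explicit = case_when(
      Q57 == "I would never do this" ~ 0,
      str_detect(Q57, "^I will do this") ~ 1
    )
  )

df
```

With the catch-all, the misspelled row is silently coded 1; with the explicit match it is left as `NA`, which makes unexpected answers easier to spot.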
geoff