
I'm trying to make a catch-all variable: if a respondent answers "yes" to at least one of the 9 yes/no variables, they should be placed in the "yes" category of the overall variable.

I've done this by:

overallvariable <- ifelse(df$v1 == "yes" | df$v2 == "yes" | df$v3 == "yes" |
                          df$v4 == "yes" | df$v5 == "yes" | df$v6 == "yes" |
                          df$v7 == "yes" | df$v8 == "yes" | df$v9 == "yes",
                          "yes", "no")

However, `table(overallvariable)` comes up with:

no
##

instead of

yes no
### ##

Thank you for your help!

Note: everything seems to work until I add v9.

Note: I just played around with where v9 goes in. It doesn't seem to be a problem attached to the variable, as it produces the output I needed. So it seems to be an issue with adding a ninth condition.

Heshani
  • Please provide parts of your data with `dput(head(yourdata))`. – MKR Nov 02 '21 at 07:10
  • Sorry, I am very new to R, what do you mean by this? – Heshani Nov 02 '21 at 07:12
  • You have the best chance of getting an answer if you provide a minimal example. You can paste part of your data into the question: `dput(head(df))`. Then people can easily use your data to get to the answer. – MKR Nov 02 '21 at 07:13
  • See [how to create a reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). How is `v9` different from the others? – MrFlick Nov 02 '21 at 07:15
  • Unfortunately the data I'm using is bound by ethics and protocols that do not allow me to share :/ – Heshani Nov 02 '21 at 07:17
  • V9 is very similar to the rest (most similar to V8) just a change in gender really e.g. Have you experienced X by female vs Have you experienced X by male – Heshani Nov 02 '21 at 07:18
  • Let me explain: You don't need to share your actual data. What you need to do is to provide an example with data that reproduces your error. People can help, as long as they can reproduce the error. But for now, we don't know what the data looks like. – MKR Nov 02 '21 at 07:21
  • Note: Just played around with where v9 goes in, it doesn't seem to be a problem attached to the variable as it produces the output i needed. So it seems to be an issue with adding a ninth condition. – Heshani Nov 02 '21 at 07:23
  • Well, this doesn't make any sense in terms of the code. This makes it even clearer that data is needed to reproduce the error. As the question stands now, I can only suggest that you check whether v9 is `character` and whether it contains spelling mistakes or `NA`s. – MKR Nov 02 '21 at 07:34
  • I think the NA's are the problem! Thank you – Heshani Nov 02 '21 at 07:36
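
To illustrate the `NA` point from the comments above, here is a minimal sketch of how a missing answer propagates through `|` and `ifelse()`. The data frame `toy` and its values are made up for illustration, and this does not necessarily reproduce the exact output shown in the question.

```r
# Toy data (hypothetical); the NA stands in for a missing answer
toy <- data.frame(
  v1 = c("yes", "no", "no"),
  v2 = c("no",  "no", NA)
)

# Same pattern as in the question, shortened to two columns
overall <- ifelse(toy$v1 == "yes" | toy$v2 == "yes", "yes", "no")
overall
# "yes" "no"  NA   (FALSE | NA is NA, so the third row is NA, not "no")

# table() drops NA by default; useNA = "ifany" makes them visible
table(overall)
table(overall, useNA = "ifany")

# %in% returns FALSE for NA, so a missing answer counts as "not yes"
overall_safe <- ifelse(toy$v1 %in% "yes" | toy$v2 %in% "yes", "yes", "no")
overall_safe
# "yes" "no"  "no"
```

Whether a missing answer should count as "no" or stay `NA` depends on the survey, so treat the `%in%` variant as one possible choice rather than the correct one.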

3 Answers


Here is a data.table approach, and also an instruction on how to create some sample data ;-)

sample data

set.seed(123)
mydata <- data.frame(id = 1:15,
                     v1 = sample(c("yes", "no"), 15, replace = TRUE),
                     v2 = sample(c("yes", "no"), 15, replace = TRUE),
                     v3 = sample(c("yes", "no"), 15, replace = TRUE),
                     v4 = sample(c("yes", "no"), 15, replace = TRUE))

code

library(data.table)
# convert to data.table format
setDT(mydata)
# columns to look in
cols <- grep("v[1-4]", names(mydata), value = TRUE)
# initialise overallvariable to "no"
mydata[, overallvariable := "no"]
# if 1 or more columns in cols have the value "yes", set overallvariable to "yes"
mydata[ rowSums(mydata[, ..cols] == "yes", na.rm = TRUE) >= 1, 
        overallvariable := "yes"]

output

#    id  v1  v2  v3  v4 overallvariable
# 1:  1 yes yes yes yes             yes
# 2:  2 yes  no  no yes             yes
# 3:  3 yes yes yes  no             yes
# 4:  4  no yes  no yes             yes
# 5:  5 yes yes  no yes             yes
# 6:  6  no yes yes  no             yes
# 7:  7  no  no yes yes             yes
# 8:  8  no yes yes yes             yes
# 9:  9 yes yes yes yes             yes
#10: 10 yes yes  no yes             yes
#11: 11  no yes yes  no             yes
#12: 12  no  no  no  no              no
#13: 13  no  no  no yes             yes
#14: 14 yes yes yes  no             yes
#15: 15  no  no yes yes             yes
Wimpel

Base R:

df$overallvariable <- c('no','yes')[1 + (rowSums(df == "yes") > 0)]

data:

df <- structure(list(V1 = c("no", "no", "no", "no", "no", "no", "no", 
"no", "no", "no"), V2 = c("yes", "yes", "yes", "no", "no", "no", 
"yes", "no", "yes", "yes"), V3 = c("yes", "yes", "yes", "no", 
"no", "no", "yes", "no", "yes", "yes"), V4 = c("no", "no", "no", 
"no", "no", "no", "no", "no", "no", "no"), V5 = c("yes", "yes", 
"yes", "no", "no", "no", "yes", "no", "yes", "yes"), V6 = c("no", 
"no", "no", "no", "no", "no", "no", "no", "no", "no"), V7 = c("no", 
"no", "no", "no", "no", "no", "no", "no", "no", "no"), V8 = c("yes", 
"yes", "yes", "no", "no", "no", "yes", "no", "yes", "yes"), V9 = c("no", 
"no", "no", "no", "no", "no", "no", "no", "no", "no"), V10 = c("no", 
"no", "no", "no", "no", "no", "no", "no", "no", "no")), class = "data.frame", row.names = c(NA, 
-10L))
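
Since the comments point to `NA`s as the likely culprit, a variant of the one-liner above that tolerates missing values might look like the following sketch. The data frame `toy` and the `cols` vector are assumptions made for illustration. Restricting the comparison to named columns also keeps an id column, or a previously created `overallvariable`, out of the count.

```r
# Toy data (hypothetical), with one missing answer
toy <- data.frame(
  id = 1:3,
  V1 = c("no", "no", "no"),
  V2 = c("yes", NA,  "no"),
  stringsAsFactors = FALSE
)

# Only look at the yes/no columns, not at id or the result column
cols <- c("V1", "V2")

# Count "yes" per row, ignoring NA, then map 0 -> "no" and >0 -> "yes"
n_yes <- rowSums(toy[cols] == "yes", na.rm = TRUE)
toy$overallvariable <- c("no", "yes")[1 + (n_yes > 0)]
toy$overallvariable
# "yes" "no"  "no"
```

With `na.rm = TRUE`, a row whose only non-"no" entries are missing is classified as "no". Without it, such a row would come out as `NA`.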
TarJae

The dplyr package has the perfect function for the desired transformation: if_any.

library(dplyr)

df %>% mutate(overallvariable = if_any(V1:V10, ~ .x=='yes') %>% ifelse('yes', 'no'))

We can also use purrr::reduce

library(purrr)
library(dplyr)

df %>% mutate(overallvariable = reduce(across(V1:V10, ~.x=='yes'), `|`) %>% ifelse('yes', 'no'))

output using the data from @TarJae:

   V1  V2  V3 V4  V5 V6 V7  V8 V9 V10 overallvariable
1  no yes yes no yes no no yes no  no             yes
2  no yes yes no yes no no yes no  no             yes
3  no yes yes no yes no no yes no  no             yes
4  no  no  no no  no no no  no no  no              no
5  no  no  no no  no no no  no no  no              no
6  no  no  no no  no no no  no no  no              no
7  no yes yes no yes no no yes no  no             yes
8  no  no  no no  no no no  no no  no              no
9  no yes yes no yes no no yes no  no             yes
10 no yes yes no yes no no yes no  no             yes
GuedesBF