
I am quite a beginner in R and I am stuck with this problem:

I have a dataset (data) with 24 columns. Each column is the answer to a question: "Is your xxx hurting?", where xxx varies by column (back, head, ...). Each column can take 3 values: 1 if it does not hurt, 2 if it hurts a bit, 3 if it hurts a lot.

I want to create a new variable (fragile) that counts how many times a person said it hurt a bit somewhere.

data$fragile <- ifelse(data$var1 ==2 , 1, 0) + ifelse(data$var2 == 2, 1, 0) + ...

But that is quite long, so I would like to avoid writing it all out.

I tried a loop, but I read somewhere that R is not made for loops (and my loop does not work anyway):

data$fragile <-  for(i in dataset[1:24]){ifelse(i == 2, 1, 0)}

And when I do head(data$fragile), "NULL" appears...

What should I do?

I am sure this question has already been asked several times, but I cannot find the right keywords.

Emeline
  • This might help you: https://stackoverflow.com/questions/22337394/dplyr-mutate-with-conditional-values In `R`, when you want to create a new variable, you can use the `mutate` function from `dplyr`. – Earl Mascetti Jun 11 '20 at 13:32

2 Answers


You can use rowSums:

data$fragile <- +(rowSums(data == 2) > 0)

This sets fragile to 1 if any column in a given row contains 2, and to 0 otherwise.
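Since the question asks for a count rather than a yes/no flag, here is a minimal sketch of both variants side by side. The toy data frame and the column names var1 to var3 are made up for illustration; with the real data the selection would presumably be the 24 question columns. Restricting the comparison to those columns keeps any other column from being counted by accident.

```r
# Hypothetical toy data: 1 = no pain, 2 = hurts a bit, 3 = hurts a lot
data <- data.frame(
  var1 = c(1, 2, 2),
  var2 = c(1, 2, 3),
  var3 = c(3, 2, 1)
)

# Names of the question columns (an assumption for this sketch)
cols <- c("var1", "var2", "var3")

# data[cols] == 2 is a logical matrix; rowSums() counts the TRUEs per row
n_two <- rowSums(data[cols] == 2)

# Indicator variant: 1 if at least one answer in the row is 2, else 0
any_two <- +(n_two > 0)

data$fragile <- n_two
```

Dropping the `> 0` comparison is the only difference between the indicator and the count.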


In the new dplyr 1.0.0 we can use rowwise with cur_data().

library(dplyr)
data %>% rowwise() %>% mutate(fragile = +(any(cur_data() == 2)))
Ronak Shah

Assuming your columns of interest have the indices 1:24:

data$fragile <- apply( data[1:24], 1, function(x) sum(x==2, na.rm = TRUE) )

If your columns are at different positions, just change the index.

This solution, however, is quite slow compared to rowSums.

Jan
  • Thank you so much! I saw that many rows of my new variable fragile turned into missing values, I assume because those people left at least one of the other questions unanswered. How can I resolve this? – Emeline Jun 11 '20 at 13:29
  • Both `sum` and `rowSums` support an argument `na.rm`. If you set this to `TRUE`, the functions will ignore any missing values. – Jan Jun 11 '20 at 13:32
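
To illustrate the point about `na.rm` from the comments, here is a small sketch on made-up data (the column names and values are assumptions, not the asker's dataset). Without `na.rm = TRUE`, a single unanswered question makes the whole row's count NA; with it, missing answers are simply skipped.

```r
# Hypothetical toy data with some unanswered questions (NA)
data <- data.frame(
  var1 = c(2, NA, 1),
  var2 = c(2, 2, NA),
  var3 = c(1, 2, 3)
)
cols <- c("var1", "var2", "var3")

# Default behaviour: any NA in a row propagates to the result
with_na <- rowSums(data[cols] == 2)

# na.rm = TRUE ignores missing answers and counts the rest
without_na <- rowSums(data[cols] == 2, na.rm = TRUE)

# Same idea with apply() and sum()
via_apply <- apply(data[cols], 1, function(x) sum(x == 2, na.rm = TRUE))
```

Note that with `na.rm = TRUE` a person who skipped every question gets a count of 0, which may or may not be what you want.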