
I work on survey data and build indexes by combining answers from different questions.

I want to create a new variable holding the average of the answers to three questions (each on a 1-to-10 scale) when every answer is at least 2. Otherwise I want the new variable to be 0. If any of the three answers is NA, the result should be NA.

Here is an example of what I am trying to obtain:

DF <-
  data.frame(matrix(
    c(1, 3, 4, 9, 4, 3, 10, 4, 6, 4, 7, NA),
    nrow = 4,
    ncol = 3
  ))

#    X1 X2 X3 
# 1  1  4  6 
# 2  3  3  4 
# 3  4 10  7  
# 4  9  4 NA  

I would like to create DF$new_var_ave such that:

#    X1 X2 X3 new_var_ave
# 1  1  4  6  0
# 2  3  3  4  3.3
# 3  4 10  7  7 
# 4  9  4 NA  NA

I have tried:
```R
DF$new_var_ave <-
  apply(DF[, c("X1", "X2", "X3")], 1, function(x) {
    ifelse(any(is.na(x)), NA, ifelse(all(x > 2), mean, 0))})
```

However, it fails with this error:

Error in rep(yes, length.out = len) : 
      attempt to replicate an object of type 'closure'

Thanks a lot for your help and suggestions.

jamman

2 Answers

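This solution uses rowSums to check whether any value in a row is below 2, and rowMeans to compute the row-wise means, both inside a vectorized ifelse call. A row containing NA yields NA, because the comparison and the row sum both propagate missing values: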

DF$new_var_ave <- ifelse(rowSums(DF < 2) > 0, 0, round(rowMeans(DF),1))
DF
  X1 X2 X3 new_var_ave
1  1  4  6         0.0
2  3  3  4         3.3
3  4 10  7         7.0
4  9  4 NA          NA
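
For comparison, the apply() attempt from the question appears to fail only because `mean` is passed as a function object (a closure) instead of being called as `mean(x)`, which is what the error message is complaining about. Below is a minimal sketch of a corrected row-wise version. It assumes the rule is "0 if any answer is below 2", as in the code above; `row_score` and its `threshold` argument are hypothetical names chosen for illustration.

```R
# Rebuild the example data from the question.
DF <- data.frame(matrix(
  c(1, 3, 4, 9, 4, 3, 10, 4, 6, 4, 7, NA),
  nrow = 4,
  ncol = 3
))

# row_score() is a hypothetical helper, not part of any package.
row_score <- function(x, threshold = 2) {
  if (anyNA(x)) return(NA_real_)       # any missing answer -> NA
  if (any(x < threshold)) return(0)    # any answer below the threshold -> 0
  mean(x)                              # otherwise the mean of the row
}

cols <- c("X1", "X2", "X3")
DF$new_var_ave <- apply(DF[, cols], 1, row_score)
DF
```

Plain `if`/`else` is used here because each row produces a single value; `ifelse` is meant for vectorized tests and tries to replicate its `yes` argument, which is where the original error came from.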
Chris Ruehlemann

Here is another way of computing your averages based on the criteria you set. I thought you might be interested:

library(purrr)
library(dplyr)
library(tidyr)

# Example data; the result column is computed below, so it is not entered by hand
df <- tribble(
  ~var1, ~var2, ~var3,
    1,    4,   6,
    3,    3,   4,
    4,   10,   7,
    9,    4,  NA
)


df %>%
  mutate(avg = pmap(list(var1, var2, var3), ~ ifelse(any(is.na(c(...))), as.double(NA), 
                                                   ifelse(all(c(...) > 2), mean(c(...)), 0))
  )) %>%
  unnest(cols = c(avg))

# A tibble: 4 x 4
   var1  var2  var3   avg
  <dbl> <dbl> <dbl> <dbl>
1     1     4     6  0   
2     3     3     4  3.33
3     4    10     7  7   
4     9     4    NA NA 
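
A slightly shorter variant is sketched below, assuming purrr's `pmap_dbl()` is acceptable: it returns a double vector directly, so the `unnest()` step is not needed. The column names `var1` to `var3` follow the code above, and the `> 2` test is kept from the question's own attempt (use `>= 2` if answers of exactly 2 should count).

```R
library(dplyr)
library(purrr)
library(tibble)

df <- tribble(
  ~var1, ~var2, ~var3,
  1,  4,  6,
  3,  3,  4,
  4, 10,  7,
  9,  4, NA
)

res <- df %>%
  mutate(avg = pmap_dbl(list(var1, var2, var3), function(...) {
    x <- c(...)                     # the three answers of the current row
    if (anyNA(x)) {
      NA_real_                      # any missing answer -> NA
    } else if (all(x > 2)) {
      mean(x)                       # all answers above 2 -> row mean
    } else {
      0                             # otherwise -> 0
    }
  }))

res
```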

Anoushiravan R
  • Great, thanks for that @Anoushiravan R! My colleagues and I are working with data frames for this project. If we do not solve our problem with a data frame, I'll take your suggestion! – jamman Apr 02 '21 at 16:36
  • My pleasure, but please feel free to upvote if any answer here solves your problem. I would be glad to provide assistance any time. – Anoushiravan R Apr 02 '21 at 16:52
  • This is a tibble, which is also a data frame, and you only need to install the packages I mentioned to be able to use it. It's your choice anyway, but it would be great to learn alternative solutions from different contributors here. – Anoushiravan R Apr 02 '21 at 16:53
  • You can use the following code if you want to use your `DF` sample. The only difference is the column names: `names(DF) <- c("var1", "var2", "var3")`. – Anoushiravan R Apr 02 '21 at 17:05