Here is one option using rowMeans within dplyr. We select the columns from 'Responsiveness' to 'Translation' (with :), mutate the dataset to create the column 'avg' with rowMeans, specifying na.rm=TRUE to remove the NA values, and then column-bind (bind_cols) the result with the remaining columns of the original dataset. The remaining columns are those of the original dataset that are not found in the mutated dataset (i.e. the . placeholder); we can use setdiff to get their names.
library(dplyr)
df %>%
select(Responsiveness:Translation) %>%
mutate(avg= rowMeans(., na.rm=TRUE)) %>%
bind_cols(df[setdiff(names(df), names(.))] , .)
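To make the setdiff step concrete, here is a minimal base R sketch of the same idea, assuming the example data given in the 'data' part of this answer. The names sel, remaining, and out are illustrative only and do not appear in the code above, and cbind stands in for bind_cols so the sketch runs without dplyr.

```r
# Example data (same as in the 'data' part of this answer)
df <- data.frame(V1 = 1:10, Responsiveness = 1:10, V2 = c(2, NA, 4:11),
                 V3 = 3:12, Translation = 4:13)

# Columns from 'Responsiveness' to 'Translation', plus the row average
sel <- df[c("Responsiveness", "V2", "V3", "Translation")]
sel$avg <- rowMeans(sel, na.rm = TRUE)

# Columns of the original data that are absent from the averaged subset
remaining <- setdiff(names(df), names(sel))
remaining
# [1] "V1"

# Remaining columns first, then the averaged subset
out <- cbind(df[remaining], sel)
names(out)
```

Because setdiff compares column names, this only works as intended when the column names are unique.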
However, rowMeans does not require any external package. In base R, we match the columns 'Responsiveness' and 'Translation' against the column names of the original dataset, which gives the numeric index of those columns. We can then build the sequence (:) from the start (i1[1]) to the end (i1[2]) and use rowMeans on the subset dataset.
i1 <- match( c('Responsiveness', 'Translation'), names(df))
df$avg <- rowMeans(df[i1[1]:i1[2]], na.rm=TRUE)
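As a quick check of what match returns here, the following sketch runs the same steps on the example data from the 'data' part of this answer. The intermediate variable cols is only there for illustration.

```r
# Example data (same as in the 'data' part of this answer)
df <- data.frame(V1 = 1:10, Responsiveness = 1:10, V2 = c(2, NA, 4:11),
                 V3 = 3:12, Translation = 4:13)

# Positions of the first and last column of the range
i1 <- match(c("Responsiveness", "Translation"), names(df))
i1
# [1] 2 5

# Expand the two end points into the full run of column indices
cols <- i1[1]:i1[2]
names(df)[cols]   # the four columns from Responsiveness through Translation

# Row-wise mean over that range, ignoring NA values
df$avg <- rowMeans(df[cols], na.rm = TRUE)
head(df, 3)
```

Note that match returns NA for a name that is not present, so a misspelled column name would make the i1[1]:i1[2] step fail.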
We can also remove some steps in the above dplyr code if we are using 'i1':
df %>%
mutate(avg= rowMeans(.[i1[1]:i1[2]], na.rm=TRUE))
NOTE: I am using dplyr_0.4.1.9000 on R 3.2.1. When there are no NA values, the OP's code gives the same output as rowMeans. But if there is an NA value, the OP's code gives a different value: for the 2nd row in the example, I get 3.5 instead of the 3.66667 returned by rowMeans. I am not getting any error, though.
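For reference, the 3.66667 figure can be reproduced by hand. The sketch below, again assuming the example data from this answer, compares rowMeans with and without na.rm=TRUE for the 2nd row; the variable names are illustrative.

```r
# Example data (same as in the 'data' part of this answer)
df <- data.frame(V1 = 1:10, Responsiveness = 1:10, V2 = c(2, NA, 4:11),
                 V3 = 3:12, Translation = 4:13)

cols <- df[c("Responsiveness", "V2", "V3", "Translation")]

# Row 2 holds the values 2, NA, 4, 5
default_avg <- rowMeans(cols)                # the NA propagates to the result
narm_avg <- rowMeans(cols, na.rm = TRUE)     # the NA is dropped before averaging

default_avg[[2]]   # NA
narm_avg[[2]]      # (2 + 4 + 5) / 3, about 3.666667
```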
data
set.seed(24)
df <- data.frame(V1=1:10, Responsiveness=1:10, V2= c(2, NA, 4:11),
V3=3:12, Translation=4:13)