Mean of Multiple Columns in R

Question

I am trying to take the mean of a list of columns in R and am running into an issue. Let's say I have:

What I am trying to do is take the row-wise mean of columns c(A, C) and save it as a new column, say E, and likewise take the mean of columns c(B, D) and save it as a different column, say F. Is that possible?

score 3 · Accepted Answer · answered Jan 31 '17 at 20:04

Check out dplyr:

library(dplyr)
df <- df %>% mutate(E=(A+C)/2, F=(B+D)/2)
df

  A  B  C  D  E  F
1 1  2  3  4  2  3
2 5  6  7  8  6  7
3 9 10 11 12 10 11
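
A base R alternative, offered only as a sketch: rowMeans gives the same E and F without dplyr, and na.rm = TRUE keeps a missing value from turning the whole mean into NA. The data frame below is an assumption, reconstructed from the output shown above.

```r
# Example data reconstructed from the printed output (an assumption).
df <- data.frame(A = c(1, 5, 9), B = c(2, 6, 10),
                 C = c(3, 7, 11), D = c(4, 8, 12))

# rowMeans() averages across the selected columns, one result per row.
# na.rm = TRUE drops missing values before averaging.
df$E <- rowMeans(df[c("A", "C")], na.rm = TRUE)
df$F <- rowMeans(df[c("B", "D")], na.rm = TRUE)

print(df)
```

Unlike (A+C)/2, this form extends to any number of columns by adding names to the selection vector.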

thc


akrun · Answer 2 · 2017-01-31T16:35:54.947

2

We can subset the dataset into columns 1 & 2 and columns 3 & 4, add the two subsets together, divide by 2, and set the column names with setNames:

setNames((df1[1:2] + df1[3:4])/2, c("E", "F"))
#   E  F
#1  2  3
#2  6  7
#3 10 11
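
To make the pairing explicit, here is a self-contained sketch of the same idea. The df1 below is hypothetical, assumed to hold the question's four columns A to D, and the columns are selected by name rather than by position so the result does not depend on column order.

```r
# Hypothetical df1, assumed to match the question's columns A-D.
df1 <- data.frame(A = c(1, 5, 9), B = c(2, 6, 10),
                  C = c(3, 7, 11), D = c(4, 8, 12))

first  <- df1[c("A", "B")]
second <- df1[c("C", "D")]

# `+` on two data frames of the same shape works position by position,
# so A is paired with C and B is paired with D.
res <- setNames((first + second) / 2, c("E", "F"))

print(res)
```

Note that this only works for pairs of equal-sized column blocks, and any NA in a pair gives NA for that row.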

Another option is rowMeans: use a logical vector that gets recycled across the columns to split them into a list of two subsets, then loop through the list with sapply and take the rowMeans of each:

i1 <- c(TRUE, FALSE)
sapply(list(df1[i1], df1[!i1]), rowMeans)
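
A short sketch of what the recycling does, again assuming a hypothetical df1 with the question's columns A to D. Naming the list elements (E and F here, my choice) labels the columns of the result.

```r
# Hypothetical df1, assumed to match the question's columns A-D.
df1 <- data.frame(A = c(1, 5, 9), B = c(2, 6, 10),
                  C = c(3, 7, 11), D = c(4, 8, 12))

# c(TRUE, FALSE) is recycled to TRUE, FALSE, TRUE, FALSE over the 4 columns,
# so df1[i1] keeps every odd column and df1[!i1] every even column.
i1 <- c(TRUE, FALSE)
odd_cols  <- names(df1[i1])    # "A" "C"
even_cols <- names(df1[!i1])   # "B" "D"

# One rowMeans() call per subset; sapply() binds the results into a matrix.
res <- sapply(list(E = df1[i1], F = df1[!i1]), rowMeans)

print(res)
```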

Another option is to unlist the dataset, convert it to an array, and use apply to get the mean:

apply(array(unlist(df1), c(3, 2, 2)), c(1,2), mean)
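
The array version is easier to follow once the layout is spelled out. In this sketch (hypothetical df1 as before, and nrow(df1) in place of the hard-coded 3), the third dimension separates the first pair of columns from the second pair, and apply averages across that dimension.

```r
# Hypothetical df1, assumed to match the question's columns A-D.
df1 <- data.frame(A = c(1, 5, 9), B = c(2, 6, 10),
                  C = c(3, 7, 11), D = c(4, 8, 12))

# unlist() concatenates the columns in order (A, B, C, D); array() then
# fills column by column, so slice 1 holds A and B, slice 2 holds C and D.
arr <- array(unlist(df1), dim = c(nrow(df1), 2, 2))

# Keeping dimensions 1 and 2 means each mean is taken across the slices:
# position [i, 1] averages A and C, position [i, 2] averages B and D.
res <- apply(arr, c(1, 2), mean)
colnames(res) <- c("E", "F")

print(res)
```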

edited Jan 31 '17 at 16:35

answered Jan 31 '17 at 16:30


If I have a list of columns, say a1,a2,a3,b1,b2,b3,c1,c2,c3 and wanted to take the mean score of columns a1,a2,a3 into a column and likewise with b and c to have a mean of a,b,c into a single column for each letter, could I do the same? – Dante Smith Jan 31 '17 at 16:36
@DanteSmith In that case do `sapply(split.default(df1, sub("\\d+", "", colnames(df1))), rowMeans)` – akrun Jan 31 '17 at 16:38
Where does the \\d+ come into play? Is that splitting the column by number? – Dante Smith Jan 31 '17 at 16:40
@DanteSmith It is based on the column names you showed. I see that you have numbers that follow a, b, c etc. We are removing that with `sub` and splitting the dataset columns with prefix 'a', 'b', 'c' – akrun Jan 31 '17 at 16:41
Thanks for the quick responses. So if the names of the columns were, say, alow, amedium, ahigh and blow, bmedium, bhigh as opposed to numerics, could you have it take the mean of every 3 columns to produce a = mean(alow, amedium, ahigh)? – Dante Smith Jan 31 '17 at 16:46
@DanteSmith Sorry, didn't see your comment. In that case, I would use `sapply(split.default(df1, substr(colnames(df1), 1, 1)), rowMeans)` – akrun Jan 31 '17 at 17:01
@DanteSmith BTW, what kind of patterns do you have in the original dataset? If we keep on changing the patterns, it will take a lot of time to solve – akrun Jan 31 '17 at 17:02
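
The two split.default one-liners from the comments above can be tried on small made-up data. Both data frames below are hypothetical; only their column names come from the comments.

```r
# Hypothetical data using the numbered column names from the comments.
df2 <- data.frame(a1 = c(1, 2), a2 = c(3, 4), a3 = c(5, 6),
                  b1 = c(10, 20), b2 = c(20, 30), b3 = c(30, 40))

# sub() strips the digits, leaving one group label per column.
groups_num <- sub("\\d+", "", colnames(df2))   # "a" "a" "a" "b" "b" "b"

# split.default() splits the data frame by columns (as a list), not by rows.
res_num <- sapply(split.default(df2, groups_num), rowMeans)

# Hypothetical data using the word-suffixed names from the comments.
df3 <- setNames(df2, c("alow", "amedium", "ahigh",
                       "blow", "bmedium", "bhigh"))

# substr() takes the first character of each name as the group label.
groups_chr <- substr(colnames(df3), 1, 1)
res_chr <- sapply(split.default(df3, groups_chr), rowMeans)

print(res_num)
print(res_chr)
```

Both calls return a matrix with one column per prefix. The substr version assumes the prefix is exactly one character long.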
@akrun Hi, I am going to do something like this question, but instead I want to find the mean of every 10 columns of my data (which has 1000 columns and some NA data). How should I do it? Can you please guide me? Thanks :) – Shalen May 07 '20 at 17:58
@akrun sure, thanks ... I thought maybe it is too basic ;) – Shalen May 07 '20 at 18:45
@akrun I did, didn't I?! – Shalen May 07 '20 at 18:57
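
No reply to the every-10-columns question appears in this thread. One possible sketch, reusing the split.default idiom from the earlier comments, is to build the group labels from column positions instead of names. The data frame df4 and the block size n are made up for illustration, with 20 columns standing in for the 1000.

```r
# Hypothetical data: 2 rows, 20 columns, with one missing value.
df4 <- as.data.frame(matrix(1:40, nrow = 2))
df4[1, 1] <- NA

n <- 10  # assumed block size: average every n consecutive columns

# Columns 1..n get label 1, columns n+1..2n get label 2, and so on.
block <- ceiling(seq_along(df4) / n)

# na.rm = TRUE is passed through sapply() to rowMeans(), so NA values
# are dropped within each block instead of producing NA means.
res <- sapply(split.default(df4, block), rowMeans, na.rm = TRUE)

print(res)
```

If the number of columns is not a multiple of n, the last block is simply averaged over fewer columns.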
