How to apply a function for multiple columns and generate the mean afterwards?

Question

I have a dataset that looks like the following. For this dataset I have a function that computes a single value for a column. How can I apply this function to columns 2 to 2536 and afterwards take the mean of all the results?

ids V1 V2 V3 V4 ......
12  1  1  2  NA
13  2  1  3  1
18  NA 2  3  3
19  1  1  NA 1

AI <- function(AI) {
  ((sort(table(AI),decreasing=TRUE)[1])-0.5*
     (sum(!is.na(AI))
      - (sort(table(AI),decreasing=TRUE)[1]))) /sum(!is.na(AI))
}

Which manual did you consult without finding an answer? Please edit the question to show that effort. If there are similar questions on Stack Overflow, link to them and briefly explain why they don't work for you. — ZF007, Dec 20 '18 at 13:27

score 0 · Answer 1 · answered Dec 20 '18 at 12:38

Something like this?

library(tidyverse)

# Rebuild the example data from the question
df <- read_table("ids V1 V2 V3 V4
12  1  1  2  NA
13  2  1  3  1
18  NA 2  3  3
19  1  1  NA 1")

df %>%
  select(contains('V')) %>%
  # formula lambda applied to each V column
  mutate_at(vars(contains('V')), ~ (. - 0.5 * sum(., na.rm = TRUE)) / sum(., na.rm = TRUE)) %>%
  replace(is.na(.), 0) %>%   # treat remaining NAs as 0
  as.matrix() %>%
  mean()
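
For comparison, the AI function from the question can also be applied column by column in base R and the results averaged. This is only a sketch: `ai_score` is a hypothetical name for a restatement of the question's formula, using `max(table(x))` for the count of the most frequent value, and returning NA for an all-NA column is an assumption, not something the question specifies.

```r
# Example data from the question (only four of the V columns).
df <- data.frame(
  ids = c(12, 13, 18, 19),
  V1  = c(1, 2, NA, 1),
  V2  = c(1, 1, 2, 1),
  V3  = c(2, 3, 3, NA),
  V4  = c(NA, 1, 3, 1)
)

# Same formula as AI() in the question; `ai_score` is a hypothetical name.
ai_score <- function(x) {
  n <- sum(!is.na(x))              # number of non-missing values
  if (n == 0) return(NA_real_)     # assumption: an all-NA column yields NA
  top <- max(table(x))             # count of the most frequent value
  (top - 0.5 * (n - top)) / n
}

# Apply to every column except the first (ids), then average the results.
per_column <- vapply(df[-1], ai_score, numeric(1))
overall    <- mean(per_column, na.rm = TRUE)
```

On the full dataset, `df[2:2536]` could be used in place of `df[-1]` to select the columns by position.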

score 0 · Answer 2 · answered Dec 20 '18 at 13:08

First, build your function:

my_func <- function(x) x * 2

Then use the dplyr library:

library(dplyr)         # part of the tidyverse
df %>%
  mutate_at(vars(2:5), my_func) %>%   # apply my_func to columns 2 to 5
  summarise_all(mean, na.rm = TRUE)   # apply mean to all columns

#   ids       V1  V2       V3       V4
#  15.5 2.666667 2.5 5.333333 3.333333
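
Note that my_func above transforms each value, whereas the AI function from the question returns a single number per column, so summarise() may be a closer fit than mutate_at() for that case. A minimal sketch, assuming dplyr 1.0 or later for across(); `ai_score` is a hypothetical name for a restatement of the question's formula, and the NA result for an all-NA column is an assumption:

```r
library(dplyr)

# Example data from the question (only four of the V columns).
df <- data.frame(
  ids = c(12, 13, 18, 19),
  V1  = c(1, 2, NA, 1),
  V2  = c(1, 1, 2, 1),
  V3  = c(2, 3, 3, NA),
  V4  = c(NA, 1, 3, 1)
)

# Same formula as AI() in the question; `ai_score` is a hypothetical name.
ai_score <- function(x) {
  n <- sum(!is.na(x))              # number of non-missing values
  if (n == 0) return(NA_real_)     # assumption: an all-NA column yields NA
  top <- max(table(x))             # count of the most frequent value
  (top - 0.5 * (n - top)) / n
}

result <- df %>%
  summarise(across(-ids, ai_score)) %>%  # one value per V column
  unlist() %>%                           # one-row data frame -> numeric vector
  mean(na.rm = TRUE)                     # average over all columns
```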

Hope it helps!
