I have a dataset that looks like the following. For this dataset I have a function that computes a single value for a column. My question is how to apply this function to columns 2 to 2536 and afterwards take the mean of all the results.

ids V1 V2 V3 V4 ......
12  1  1  2  NA
13  2  1  3  1
18  NA 2  3  3
19  1  1  NA 1

AI <- function(AI) {
  ((sort(table(AI),decreasing=TRUE)[1])-0.5*
     (sum(!is.na(AI))
      - (sort(table(AI),decreasing=TRUE)[1]))) /sum(!is.na(AI))
}
  • `mean` for each column or for all columns `2:2536`? – jyjek Dec 20 '18 at 12:35
  • Which manual did you consult without finding an answer? Please edit the question to show that effort. If there are similar questions on Stack Overflow, link to them and briefly explain why they do not work for you. – ZF007 Dec 20 '18 at 13:27

2 Answers

Something like this?

library(tidyverse)

df=read_table("ids V1 V2 V3 V4 
12  1  1  2  NA
13  2  1  3  1
18  NA 2  3  3
19  1  1  NA 1")
df %>% 
  select(contains('V')) %>% 
  # funs() is deprecated in current dplyr; a formula (~) does the same job
  mutate_at(vars(contains('V')), ~ (. - 0.5 * sum(., na.rm = TRUE)) / sum(., na.rm = TRUE)) %>% 
  replace(is.na(.),0) %>% as.matrix() %>% 
  mean
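
To apply the function from the question itself to every value column and then average the results, a base R sketch could look like the following. The name `agreement_index` is hypothetical; it is a rewrite of the question's `AI()` that uses `max(table(x))` for the count of the most frequent value, which is the same number as `sort(table(x), decreasing = TRUE)[1]`. It assumes no column is entirely `NA`.

```r
# Toy data from the question; the real data would have columns 2 to 2536.
df <- data.frame(
  ids = c(12, 13, 18, 19),
  V1  = c(1, 2, NA, 1),
  V2  = c(1, 1, 2, 1),
  V3  = c(2, 3, 3, NA),
  V4  = c(NA, 1, 3, 1)
)

# Hypothetical rewrite of the question's AI() (assumes at least one non-NA value).
agreement_index <- function(x) {
  n   <- sum(!is.na(x))   # number of non-missing values
  top <- max(table(x))    # count of the most frequent value; table() drops NA
  (top - 0.5 * (n - top)) / n
}

per_column <- sapply(df[, -1], agreement_index)  # one value per column, ids excluded
overall    <- mean(per_column)                   # mean of the per-column results
```

For the full dataset, `df[, 2:2536]` would select the columns named in the question instead of `df[, -1]`.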
jyjek

First, build your function:

 my_func <- function(x) x*2

Then use the dplyr library:

library(dplyr)         # part of the tidyverse
df %>% 
  mutate_at( vars(2:5), my_func ) %>%  # apply my_func to columns 2 to 5
  summarise_all( mean, na.rm = TRUE )  # take the mean of every column, ignoring NA

#   ids       V1  V2       V3       V4
#  15.5 2.666667 2.5 5.333333 3.333333
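
Since `my_func` is only a placeholder, here is a sketch of how the question's function might be plugged into a dplyr pipeline to get one overall mean. It assumes dplyr 1.0.0 or later for `across()`. The function body is the one from the question with the argument renamed to `x`, and `as.numeric()` is added because `sort(table(x))[1]` returns a one-element table rather than a plain number.

```r
library(dplyr)  # assumes dplyr >= 1.0.0 for across()

# Toy data from the question.
df <- data.frame(
  ids = c(12, 13, 18, 19),
  V1  = c(1, 2, NA, 1),
  V2  = c(1, 1, 2, 1),
  V3  = c(2, 3, 3, NA),
  V4  = c(NA, 1, 3, 1)
)

# The question's function, with the argument renamed to x for readability.
AI <- function(x) {
  top <- sort(table(x), decreasing = TRUE)[1]  # count of the most frequent value
  (top - 0.5 * (sum(!is.na(x)) - top)) / sum(!is.na(x))
}

result <- df %>%
  summarise(across(2:5, ~ as.numeric(AI(.x)))) %>%  # one value per column
  unlist() %>%                                      # one-row data frame to vector
  mean()                                            # mean over all columns
```

With the real data, `2:5` would become `2:2536`.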

Hope it helps!

Chuck Ramirez