With this data frame, I want to summarize the information in the "inf" column by extracting the percentage of 1s for each id level:

d <- "
    id        var iso exp rep pl inf nles 
    BM_1_1_1   B   M   1   1  1   1   19   
    BM_1_1_2   B   M   1   1  2   1   50
    BM_1_1_3   B   M   1   1  3   1   18
    BM_1_2_1   B   M   1   2  1   0    1
    BM_1_2_2   B   M   1   2  2   1   30
    BM_1_2_3   B   M   1   2  3   1   38
    BM_1_3_1   B   M   1   2  1   1    1
    BM_1_3_2   B   M   1   2  2   0    0
    BM_1_3_3   B   M   1   2  3   0    0 
"

d <- read.table(text = d, header = TRUE, check.names = FALSE)

I would like to obtain this new aggregated data frame with a new column:

id        var iso exp rep pl inf nles newcolumn
BM_1_1_1   B   M   1   1  1   1   19     100
BM_1_2_1   B   M   1   2  1   0    1      66
BM_1_3_1   B   M   1   2  1   0    1      33

Could someone help me with that? Thanks in advance!

Juanchi
    Have you tried searching? This should help you get what you want: http://stackoverflow.com/questions/18799901/data-frame-group-by-column. Yes, it's a comment and not an answer @jaap; I deleted the answer. – user5249203 Feb 09 '16 at 17:24

1 Answer

Your question, as it's currently stated, does not reflect the desired output. That said, if you create a new_id column, you can calculate your new_column by passing a function to aggregate(). This works if your inf column is binary (i.e., strictly ones and zeros), since the mean of a 0/1 vector is the proportion of ones:

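d$new_id <- substr(d$id, 1, 6)  # group key, e.g. "BM_1_1"
# aggregate() returns one row per new_id, so keep the result as its own data frame
new_column <- aggregate(inf ~ new_id, data = d, FUN = function(x) mean(x) * 100)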
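
If the goal is one row per group with the percentage attached, as in the question's desired output, one possible sketch swaps aggregate() for base R's ave(), which returns a value for every row and so can be assigned as a column directly. The names new_id, new_column and result are illustrative, and keeping the first row of each group is an assumption about which row should represent the group:

```r
# Rebuild the question's data so the sketch is self-contained
d <- read.table(text = "
id        var iso exp rep pl inf nles
BM_1_1_1   B   M   1   1  1   1   19
BM_1_1_2   B   M   1   1  2   1   50
BM_1_1_3   B   M   1   1  3   1   18
BM_1_2_1   B   M   1   2  1   0    1
BM_1_2_2   B   M   1   2  2   1   30
BM_1_2_3   B   M   1   2  3   1   38
BM_1_3_1   B   M   1   2  1   1    1
BM_1_3_2   B   M   1   2  2   0    0
BM_1_3_3   B   M   1   2  3   0    0
", header = TRUE, stringsAsFactors = FALSE)

# Group key: the first six characters of id, e.g. "BM_1_1"
d$new_id <- substr(d$id, 1, 6)

# ave() applies FUN within each group and returns a vector as long as d$inf
d$new_column <- ave(d$inf, d$new_id, FUN = function(x) mean(x) * 100)

# Assumption: the first row of each group stands in for the whole group
result <- d[!duplicated(d$new_id), ]
print(result)
```

The percentages are not rounded here; wrap new_column in round() or floor() if whole numbers are wanted.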
JasonAizkalns