R calculated column result of aggregating data over groups

Asked Mar 17 '16 at 13:52

Active Mar 17 '16 at 13:52

Viewed 30 times

I have a data frame which looks like this:

dput(test.df)
structure(c("a", "a", "b", "c", "d", "1", "2", "3", "3", "4"), .Dim = c(5L, 
2L), .Dimnames = list(NULL, c("session_", "vid_")))

What I need to do in R is check which sessions have more than one vid_ and mark those rows as TRUE in a new column, so the result looks like this:

dput(results.df)
structure(c("a", "a", "b", "c", "d", "1", "2", "3", "3", "4", 
"TRUE", "TRUE", "FALSE", "FALSE", "FALSE"), .Dim = c(5L, 3L), .Dimnames = list(
    NULL, c("session_", "vid_", "dirty_session")))

I want to do it cleanly in one line of code, without grouping and counting the vids into a separate dataset and joining it back to the original set. Any help appreciated.

asked Mar 17 '16 at 13:52

Nir Regev

Try `ave(test.df[, 2], test.df[, 1], FUN = function(x) length(unique(x))) > 1` – David Arenburg Mar 17 '16 at 13:59
You also probably shouldn't use mixed column classes in a `matrix`. Take a look into `?data.frame`. – David Arenburg Mar 17 '16 at 14:09
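
For reference, here is a minimal sketch of how the two comments above might be combined. It assumes the data is rebuilt as a `data.frame` with a numeric `vid_` column; that rebuild is an assumption for illustration, not the asker's original object (which is a character matrix). The `ave()` call is the one suggested in the first comment, only written with column names.

```r
# Assumption: the question's data rebuilt as a data.frame instead of a character matrix.
test.df <- data.frame(
  session_ = c("a", "a", "b", "c", "d"),
  vid_     = c(1, 2, 3, 3, 4),
  stringsAsFactors = FALSE
)

# One line: count the distinct vid_ values within each session_,
# then flag every row whose session has more than one.
test.df$dirty_session <- ave(test.df$vid_, test.df$session_,
                             FUN = function(x) length(unique(x))) > 1

print(test.df)
```

With this input, `dirty_session` comes out as TRUE, TRUE, FALSE, FALSE, FALSE, which matches the desired result in the question. `ave()` returns a vector the same length as its input, so no separate group-by table or join is needed.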


0 Answers