0

I have a data frame with the following structure:

 x <- data.frame("id" = c(1,1,1,2,2), "v1" = c("NB","MTA","MTA","RN","CANC"), "v2" = c(1,2,2,10,9))

What I want to do is create a new column in this dataframe which is based on the values of the "v1" column. The column should check for each unique id whether the "v1" column has a "NB" or a "RN" value. If it does, the column should have that value. The result should be:

>  result <- data.frame("id" = c(1,1,1,2,2), "v1" = c("NB","MTA","MTA","RN","CANC"), "v2" = c(1,2,2,10,9), "v3" = c("NB","NB","NB","RN","RN"))
> result
  id   v1 v2 v3
1  1   NB  1 NB
2  1  MTA  2 NB
3  1  MTA  2 NB
4  2   RN 10 RN
5  2 CANC  9 RN

I've been messing around with group-by in dplyr but cna't get it to work

user33484
  • 740
  • 2
  • 9
  • 36

2 Answers2

0

You can use ave to find if NB or RN exists per group like:

x$V3  <- ave(x$v1, x$id, FUN=function(x) c("NB", "RN")[min(match(x, c("NB", "RN")), na.rm=TRUE)])
x
#  id   v1 v2 V3
#1  1   NB  1 NB
#2  1  MTA  2 NB
#3  1  MTA  2 NB
#4  2   RN 10 RN
#5  2 CANC  9 RN

Or with a new function, returning NA if nothing was found:

f  <- (function(y) {
  force(y)
  function(x) {
    y[min(c(3, match(x, y)), na.rm=TRUE)]
  }
})(c("NB", "RN", NA))
x$V3  <- ave(x$v1, x$id, FUN=f)
GKi
  • 37,245
  • 2
  • 26
  • 48
0

With dplyr, you can do:

df %>%
 group_by(id) %>%
 mutate(v3 = if_else(any(v1 == "NB"), "NB", "RN"))

     id v1       v2 v3   
  <dbl> <fct> <dbl> <chr>
1     1 NB        1 NB   
2     1 MTA       2 NB   
3     1 MTA       2 NB   
4     2 RN       10 RN   
5     2 CANC      9 RN

Or:

df %>%
 group_by(id) %>%
 mutate(v3 = factor(+(any(v1 == "NB")), 
                    levels = c(1, 0), 
                    labels = c("NB", "RN")))
tmfmnk
  • 38,881
  • 4
  • 47
  • 67