1

I have a data frame with 3 different identifications and sometimes they overlap. I want to create a new column, with only one of those ids, in an order of preference (id1>id2>id3). Ex.:

id1  id2   id3
12   145   8763
45   836   5766
13   768   9374
     836   5766
12   145   
           9282
     567   
45   836   5766

and I want to have:

    id1  id2   id3    id.new
    12   145   8763   12
    45   836   5766   45
    13   768   9374   13
         836   5766   836
               9282   9282
         567          567

I have tried the if else,which, grep functions.. but I can't make it work.

Ex. of my try:

df$id1 <- ifelse(df$id1 == "", paste(df$2), (ifelse(df$id1)))

I am able to do this on Excel, but I am switching to R, for being more reliable and reproducible :) But in excel I would use:

=if(A1="",B1,(if(B1="",C1,B1)),A1)
10 Rep
  • 2,217
  • 7
  • 19
  • 33
bayz_isa
  • 13
  • 3
  • Previous, related discussion - https://stackoverflow.com/questions/19253820/how-to-implement-coalesce-efficiently-in-r – thelatemail May 25 '20 at 05:12

2 Answers2

2

Using coalesce from the dplyr package, we can try:

library(dplyr)
df$id.new <- coalesce(df$id1, df$id2, df$id3)
df

  id1 id2  id3 id.new
1  12 145 8763     12
2  45 836 5766     45
3  13 768 9374     13
4  NA 836 5766    836
5  12 145   NA     12
6  NA  NA 9282   9282
7  NA 567   NA    567
8  45 836 5766     45

Data:

df <- data.frame(id1=c(12,45,13,NA,12,NA,NA,45),
                 id2=c(145,836,768,836,145,NA,567,836),
                 id3=c(8763,5766,9374,5766,NA,9282,NA,5766))
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • thank you SO much! I've been working on this for the past 2 days.. Also, to remove duplicates, can I just use after: ```df<- id.new[!duplicated(df$id.new),]``` ? – bayz_isa May 25 '20 at 04:24
0

In base you can use apply of is.na(df) with function which.min to get a matrix used for subsetting. Thanks to @tim-biegeleisen for the dataset.

df$id.new <- df[cbind(1:nrow(df), apply(is.na(df), 1, which.min))]
df
#  id1 id2  id3 id.new
#1  12 145 8763     12
#2  45 836 5766     45
#3  13 768 9374     13
#4  NA 836 5766    836
#5  12 145   NA     12
#6  NA  NA 9282   9282
#7  NA 567   NA    567
#8  45 836 5766     45
GKi
  • 37,245
  • 2
  • 26
  • 48