I would like to solve the following problem in R:

I have a data table with values from two labs, lab1 and lab2. Either column may contain NA values, in which case the column "lab3" should take the non-NA value. If both lab1 and lab2 are NA, lab3 is of course also NA. If both contain values, lab3 should take the value of lab2 (this is the "more important" lab).

Example:

lab1     lab2     lab3
5        7        7       < lab3 takes the value of lab2 (because it is more important)
8        10       10      < lab3 takes the value of lab2 (because it is more important)
NA       3        3       < lab3 takes the value of lab2, because lab1 is NA
9        NA       9       < lab3 takes the value of lab1, because lab2 is NA
NA       NA       NA      < lab3 is NA, because both lab1 and lab2 are NA

I found the function coalesce (dplyr), but I could not work out how to give lab2 priority over lab1.

Thanks for any help!

ThomasIsCoding
MDStat

2 Answers


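If you are working with data.table, you can try fcoalesce. It returns the first non-NA value among its arguments, so passing lab2 first gives it priority over lab1.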

> setDT(df)[, lab3 := fcoalesce(lab2, lab1)][]
   lab1 lab2 lab3
1:    5    7    7
2:    8   10   10
3:   NA    3    3
4:    9   NA    9
5:   NA   NA   NA
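
For a reproducible version of the call above, here is a minimal sketch. The construction of df is an assumption, since the question only shows the table and not how it was built. The point is the argument order: fcoalesce fills from left to right, so lab2 goes first.

```r
library(data.table)

# Assumed reconstruction of the example data from the question
df <- data.frame(
  lab1 = c(5, 8, NA, 9, NA),
  lab2 = c(7, 10, 3, NA, NA)
)

# lab2 is listed first, so it wins whenever it is not NA;
# lab1 is only used to fill the rows where lab2 is NA
setDT(df)[, lab3 := fcoalesce(lab2, lab1)]
df$lab3
```

Swapping the arguments to fcoalesce(lab1, lab2) would instead make lab1 the preferred lab.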
ThomasIsCoding

This may work. With dplyr, use ifelse inside mutate: take lab2 when it is not NA, otherwise fall back to lab1.

df %>%
  mutate(lab3 = ifelse(!is.na(lab2), lab2, lab1))

   lab1  lab2  lab3
  <dbl> <dbl> <dbl>
1     5     7     7
2     8    10    10
3    NA     3     3
4     9    NA     9
5    NA    NA    NA
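
Since the question mentions coalesce from dplyr, the same result can probably be had by simply putting lab2 first in the call. The sketch below assumes the example data is rebuilt as a tibble (the question does not show how df was created). One possible reason to prefer it: base ifelse drops attributes such as the Date class, while coalesce keeps the type of its inputs.

```r
library(dplyr)

# Assumed reconstruction of the example data from the question
df <- tibble(
  lab1 = c(5, 8, NA, 9, NA),
  lab2 = c(7, 10, 3, NA, NA)
)

# coalesce() takes the first non-NA value per row, in argument order,
# so lab2 has priority and lab1 only fills the gaps
res <- df %>%
  mutate(lab3 = coalesce(lab2, lab1))
res$lab3
```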
Park