How to use Coalesce function on a dataframe

Question

I tried finding a way to handle NA data but unable to do so, Please assist me on this.

I have below data,

R1 <- c("15515","5156","65656","1566", "2857","8888","65656","1566","65651")
R2 <- c("515","5156.11-","415-","1455-","886","888","777","666","4457")
RC1 <- c("AW","FG","ZA","ZI","","CW","","","")
RC2 <- c("SSSBB","","","ZXXQA","","CQAER","","KKHDY","TTQWW")
RC3 <- c("KKAJDJHW","XVVJAKWA","","","","CDDGAJJA","GGGAJTTD","","BBNMNJJI")
df <- data.frame(R1,R2,RC1,RC2,RC3)

I am using below code to tackle NA among only RC1,RC2 & RC3

df_1$RCC <- with(df_1, coalesce(df_1$RC3,df_1$RC2,df_1$RC1))
df_1

I am unable to fetch data from RC1 & RC2. Need your Kind assistance.

score 2 · Accepted Answer · answered Aug 18 '20 at 07:23

2

For coalesce to work you need NA's and not blanks. Change the blanks to NA and try :

library(dplyr)

df[df == ''] <- NA
df %>% mutate(RCC = coalesce(RC3, RC2, RC1))

#    R1       R2  RC1   RC2      RC3      RCC
#1 15515      515   AW SSSBB KKAJDJHW KKAJDJHW
#2  5156 5156.11-   FG  <NA> XVVJAKWA XVVJAKWA
#3 65656     415-   ZA  <NA>     <NA>       ZA
#4  1566    1455-   ZI ZXXQA     <NA>    ZXXQA
#5  2857      886 <NA>  <NA>     <NA>     <NA>
#6  8888      888   CW CQAER CDDGAJJA CDDGAJJA
#7 65656      777 <NA>  <NA> GGGAJTTD GGGAJTTD
#8  1566      666 <NA> KKHDY     <NA>    KKHDY
#9 65651     4457 <NA> TTQWW BBNMNJJI BBNMNJJI

answered Aug 18 '20 at 07:23

Ronak Shah

377,200
20
156
213

Thank you, But unfortunately I am unable to achieve it, must be due to R version in my system. Can there be some other solution possible. – Vin Aug 18 '20 at 07:32
What's the issue? Does it give you incorrect result or gives an error message ? I assume you are assigning the data back for changes to reflect.`df <- df %>% mutate(RCC = coalesce(RC3, RC2, RC1))` – Ronak Shah Aug 18 '20 at 07:36
I assigned back to df, When I View(df) it is same as the one I pasted as image in my question statement. Only RC3 values are reflecting. – Vin Aug 18 '20 at 07:39
Are you using the same data shown in your post? – Ronak Shah Aug 18 '20 at 07:40
Yes the same one, I suspect it must be an issue with R version as I use older one. Not sure. – Vin Aug 18 '20 at 07:41
1

Issue with R version and probably `dplyr` as well. My `packageVersion('dplyr') #[1] ‘1.0.1’`. What is yours? Try with `df <- data.frame(R1,R2,RC1,RC2,RC3, stringsAsFactors = FALSE)` – Ronak Shah Aug 18 '20 at 07:45
I have the older one. packageVersion('dplyr') #[1] ‘0.7.8’. But using stringsAsFactors = FALSE, made the trick! Thank you! – Vin Aug 18 '20 at 07:47

score 2 · Answer 2 · answered Aug 18 '20 at 19:55

We can also do this without having to type all the column names

library(dplyr)# >= 1.0.0    
df %>% 
    mutate(across(everything(), na_if, "")) %>%
    mutate(RCC = coalesce(!!! select(., RC3:RC1)))
#    R1       R2  RC1   RC2      RC3      RCC
#1 15515      515   AW SSSBB KKAJDJHW KKAJDJHW
#2  5156 5156.11-   FG  <NA> XVVJAKWA XVVJAKWA
#3 65656     415-   ZA  <NA>     <NA>       ZA
#4  1566    1455-   ZI ZXXQA     <NA>    ZXXQA
#5  2857      886 <NA>  <NA>     <NA>     <NA>
#6  8888      888   CW CQAER CDDGAJJA CDDGAJJA
#7 65656      777 <NA>  <NA> GGGAJTTD GGGAJTTD
#8  1566      666 <NA> KKHDY     <NA>    KKHDY
#9 65651     4457 <NA> TTQWW BBNMNJJI BBNMNJJI

How to use Coalesce function on a dataframe

2 Answers2

Linked