2

I have got multiple columns. All columns do have NA values in some rows. Is it possible to unite these columns without having the NA values in the new column?

Without NA values:

library(dplyr)
unite(mtcars, 'mpg_am', c('mpg','am'))

Creating fake data:

mtcars$NA_1 = ifelse(mtcars$mpg>20, NA, mtcars$mpg)
mtcars$NA_2 = ifelse(mtcars$cyl>6, NA, mtcars$mpg)
unite(mtcars, 'Var1', c('NA_1','NA_2'))

This will create values like

Var1
NA_21
15.5_NA
NA_NA
15.5_21
...

desired output:

Var1
21
15.5
NA
15.5_21
...
Koot6133
  • 1,428
  • 15
  • 26

2 Answers2

3

We can use unite with na.rm

library(tidyverse)
mtcars %>%
    rownames_to_column('rn') %>%
    mutate_at(vars(starts_with("NA")), as.character) %>%
    unite(Var1, NA_1, NA_2, na.rm = TRUE) %>%
    mutate(Var1 = na_if(Var1, "")) %>%
    column_to_rownames('rn')

Or another option is coalesce instead of unite

mtcars %>%
    mutate(Var1 = str_c(coalesce(NA_1, NA_2), coalesce(NA_2, NA_1), sep="_")) 

Or another option is

mtcars %>%
   mutate_at(vars(starts_with("NA")), list(~ replace_na(., ''))) %>% 
   mutate(Var1 = str_remove(na_if(str_c(NA_1, NA_2, sep="_"), '_'), '^_|_$') ) %>% 
   select(-NA_1, NA_2)
akrun
  • 874,273
  • 37
  • 540
  • 662
  • This will only work in the development version of Tidyr though. The na.rm parameter is not yet implemented in the CRAN version of Tidyr. – Koot6133 Aug 26 '19 at 14:49
  • 1
    It was more a note for other stackoverflow users wondering why the solutions does not work with the CRAN version of tidyr. Thank you for providing also a possibility without using the development version of tidyr. – Koot6133 Aug 27 '19 at 07:46
3

unite has got na.rm parameter which will remove NA values but for that column needs to be of character type.

library(dplyr)
library(tidyr)

mtcars %>%
   mutate_at(vars(NA_1, NA_2), as.character) %>%
   unite(Var1, NA_1, NA_2, na.rm = TRUE)

#    mpg cyl  disp  hp drat    wt  qsec vs am gear carb      Var1
#1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4        21
#2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4        21
#3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1      22.8
#4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1      21.4
#5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2          
#6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 18.1_18.1
#.....

However, if both the values are NA then this will return empty values instead of NA, if we need NA strictly we can check for empty values and replace

mtcars %>%
   mutate_at(vars(NA_1, NA_2), as.character) %>%
   unite(Var1, NA_1, NA_2, na.rm = TRUE)
   mutate(Var1 = replace(Var1, Var1 == "", NA_character_))

Without any packages we can use paste0 in base R

cols <- c('NA_1','NA_2')
mtcars["V1"] <- apply(mtcars[cols],1,function(x) paste0(na.omit(x), collapse = "-"))
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • This will only work in the development version of Tidyr though. The na.rm parameter is not yet implemented in the CRAN version of Tidyr. – Koot6133 Aug 26 '19 at 14:47