I am trying to concatenate two columns in R using:

df_new$conc_variable <- paste(df$var1, df$var2)

My dataset looks as follows:

id  var1   var2
1   10     NA
2   NA     8
3   11     NA
4   NA     1

I am trying to get it such that there is a third column:

id  var1   var2  conc_var
1   10     NA    10
2   NA     8     8
3   11     NA    11
4   NA     1     1

but instead I get:

id  var1   var2  conc_var
1   10     NA    10NA
2   NA     8     8NA
3   11     NA    11NA
4   NA     1     1NA

Is there a way to exclude NAs in the paste process? I tried including na.rm = FALSE, but that just added FALSE at the end of the NA in the conc_var column. Here is the dataset:

id <- c(1,2,3,4)
var1 <- c(10, NA, 11, NA)
var2 <- c(NA, 8, NA, 1)
df <- data.frame(id, var1, var2)
zx8754

3 Answers

One of many options is to use ifelse:

df <- data.frame(var1 = c(10, NA, 11, NA),
                 var2 = c(NA, 8, NA, 1))

df$new <- ifelse(is.na(df$var1), yes = df$var2, no = df$var1)


print(df)

Depending on the circumstances, rowSums might be suitable as well, since each row here has at most one non-NA value:

df$new2 <- rowSums(df[, c("var1", "var2")], na.rm = TRUE)

print(df)
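
The two approaches above agree on the example data but can differ on edge cases. Here is a small sketch using a made-up data frame, edge (not part of the question), with one row where both values are present and one where both are missing:

```r
# Hypothetical data: row 3 has both values, row 4 has neither
edge <- data.frame(var1 = c(10, NA, 11, NA),
                   var2 = c(NA, 8, 5, NA))

# ifelse keeps var1 whenever it is present and falls back to var2 otherwise
edge$new <- ifelse(is.na(edge$var1), yes = edge$var2, no = edge$var1)

# rowSums adds up whatever values are present in the row
edge$new2 <- rowSums(edge[, c("var1", "var2")], na.rm = TRUE)

print(edge)
```

In row 3 ifelse returns 11 while rowSums returns 16, and in row 4 ifelse returns NA while rowSums returns 0, so which one is suitable depends on what the data can contain.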
Bernhard

You can use tidyr::unite:

df <- tidyr::unite(df, conc_var, var1, var2, na.rm = TRUE, remove = FALSE)
df

#  id conc_var var1 var2
#1  1       10   10   NA
#2  2        8   NA    8
#3  3       11   11   NA
#4  4        1   NA    1

If, as in the example, each row has only one non-NA value, you can also use pmax or coalesce:

pmax(df$var1, df$var2, na.rm = TRUE)
dplyr::coalesce(df$var1, df$var2)
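
If a row can hold more than one non-NA value, these two no longer agree: pmax returns the larger value, while coalesce returns the first non-missing one. To actually paste whatever is present, one base R sketch is the following (the helper name paste_non_na is made up for illustration):

```r
# Hypothetical helper: paste the non-NA values of each row together
paste_non_na <- function(data, cols, sep = "") {
  apply(data[, cols, drop = FALSE], 1,
        function(x) paste(x[!is.na(x)], collapse = sep))
}

# Data from the question
df <- data.frame(id = c(1, 2, 3, 4),
                 var1 = c(10, NA, 11, NA),
                 var2 = c(NA, 8, NA, 1))

df$conc_var <- paste_non_na(df, c("var1", "var2"))
print(df)
```

The result is a character column, so wrap it in as.numeric if numbers are needed. A row where every value is NA yields an empty string.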
Ronak Shah

You could use glue from the glue package instead.

glue::glue(10, NA, .na = '')

rbasa