3

Example data.frame:

df <- data.frame(col_1=c("A", NA, NA), col_2=c(NA, "B", NA), col_3=c(NA, NA, "C"), other_col=rep("x", 3), stringsAsFactors=F)
df
  col_1 col_2 col_3 other_col
1     A  <NA>  <NA>         x
2  <NA>     B  <NA>         x
3  <NA>  <NA>     C         x

I can create a new column new_col filled with non-NA values from the 3 columns col_1, col_2 and col_3:

df %>% 
mutate(new_col = case_when(
   !is.na(col_1) ~ col_1,
   !is.na(col_2) ~ col_2,
   !is.na(col_3) ~ col_3,
   TRUE ~ "none"))

  col_1 col_2 col_3 other_col new_col
1     A  <NA>  <NA>         x       A
2  <NA>     B  <NA>         x       B
3  <NA>  <NA>     C         x       C

However, sometimes the number of columns from which I pick the new_col value can vary.

How could I check that the columns exist before applying the previous case_when command?

The following triggers an error:

df %>% 
select(-col_3) %>% 
mutate(new_col = case_when(
   !is.null(.$col_1) & !is.na(col_1) ~ col_1,
   !is.null(.$col_2) & !is.na(col_2) ~ col_2,
   !is.null(.$col_3) & !is.na(col_3) ~ col_3, 
   TRUE ~ "none"))

Error: Problem with `mutate()` input `new_col`.
   x object 'col_3' not found
   ℹ Input `new_col` is `case_when(...)`.
zx8754
  • 52,746
  • 12
  • 114
  • 209
u31889
  • 331
  • 1
  • 9

2 Answers2

4

I like Adam's answer, but if you want to be able to combine from col_1 and col_2 (assuming they both have values), you should use unite()

library(tidyverse)
df %>%
  unite(new_col, starts_with("col"), remove = FALSE, na.rm = TRUE)

Edit to respond to: "How could I check that the columns exist before applying the previous case_when command?"

You won't need to check with this command. And if your columns to unite aren't named consistently, replace starts_with("col") with c("your_name_1", "your_name_2", "etc.")

dyrland
  • 608
  • 1
  • 7
  • 17
3

You can use coalesce.

library(dplyr)

# vector of all the columns you might want
candidate_cols <- paste("col", 1:3, sep = "_")

# convert to symbol only the ones in the dataframe
check_cols <- syms(intersect(candidate_cols, names(df)))

# coalesce over the columns to check
df %>% 
  mutate(new_col = coalesce(!!!check_cols))

#  col_1 col_2 col_3 other_col new_col
#1     A  <NA>  <NA>         x       A
#2  <NA>     B  <NA>         x       B
#3  <NA>  <NA>     C         x       C