I have a data.frame, in this format:

A   w   x   y   z
0.23    1   NA  NA  NA
0.12    NA  2   NA  NA
0.45    NA  2   NA  NA
0.89    NA  NA  3   NA
0.12    NA  NA  NA  4

And I want to collapse w:x:y:z into a single column, while removing NA's. Desired result:

A   Comb
0.23    1
0.12    2
0.45    2
0.89    3
0.12    4

My approach so far is:

df %>% unite("Comb", w:x:y:z, na.rm=TRUE, remove=TRUE)

However, "Comb" is being populated with strings such as 1_NA_NA_NA and NA_NA_NA_4, i.e. the NAs are not being removed. I've tried switching to character NAs, but that leads to bizarre and unpredictable results. What am I doing wrong?

I'd also like to be able to do this when the original data.frame is populated with strings (in place of the numbers). Is there a method for this?

user3012926

4 Answers

Using dplyr::coalesce we can do the following:

df %>%
  mutate(Comb = coalesce(w, x, y, z)) %>%
  select(A, Comb)

which gives the following output:

      A  Comb
  <dbl> <dbl>
1  0.23     1
2  0.12     2
3  0.45     2
4  0.89     3
5  0.12     4
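
The question also asks about string columns. As a minimal sketch, assuming the same layout with character values (the data frame df_chr and its values are made up for illustration, not taken from the question), coalesce should behave the same way:

```r
library(dplyr)

# Hypothetical string version of the question's data (not from the original post)
df_chr <- data.frame(
  A = c(0.23, 0.12, 0.45, 0.89, 0.12),
  w = c("a", NA, NA, NA, NA),
  x = c(NA, "b", "b", NA, NA),
  y = c(NA, NA, NA, "c", NA),
  z = c(NA, NA, NA, NA, "d"),
  stringsAsFactors = FALSE
)

# coalesce() takes, for each row, the first non-missing value across w, x, y, z
res_chr <- df_chr %>%
  mutate(Comb = coalesce(w, x, y, z)) %>%
  select(A, Comb)

print(res_chr)
```

Note that coalesce keeps only the first non-NA value per row, so if a row could hold more than one value, the later ones are silently dropped.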
monarque13

In unite, na.rm = TRUE does not drop NA values from integer/factor columns.

Convert the columns to character first, and then use unite.

library(dplyr)

df %>%
  mutate_at(vars(w:z), as.character) %>% 
  tidyr::unite('comb', w:z, na.rm = TRUE)

#     A comb
#1 0.23    1
#2 0.12    2
#3 0.45    2
#4 0.89    3
#5 0.12    4
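
As a hedged variant of the same idea: in dplyr 1.0.0 and later, across() can be used in place of mutate_at. The sketch below also converts the united column back to integer at the end, which assumes every row has exactly one non-NA value; a row with several values would give something like "1_2", which as.integer would turn into NA with a warning. The name res_unite is only for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- data.frame(
  A = c(0.23, 0.12, 0.45, 0.89, 0.12),
  w = c(1L, NA, NA, NA, NA),
  x = c(NA, 2L, 2L, NA, NA),
  y = c(NA, NA, NA, 3L, NA),
  z = c(NA, NA, NA, NA, 4L)
)

res_unite <- df %>%
  mutate(across(w:z, as.character)) %>%   # convert to character before uniting
  unite("Comb", w:z, na.rm = TRUE) %>%    # NAs are dropped for character columns
  mutate(Comb = as.integer(Comb))         # assumes one value per row

print(res_unite)
```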

data

df <- structure(list(A = c(0.23, 0.12, 0.45, 0.89, 0.12), w = c(1L, 
NA, NA, NA, NA), x = c(NA, 2L, 2L, NA, NA), y = c(NA, NA, NA, 
3L, NA), z = c(NA, NA, NA, NA, 4L)), class = "data.frame", 
row.names = c(NA, -5L))
Ronak Shah

Another option is fcoalesce from data.table (note that setDT converts df to a data.table by reference):

library(data.table)
setDT(df)[,  .(A, Comb = fcoalesce(w, x, y, z))]

data

df <- structure(list(A = c(0.23, 0.12, 0.45, 0.89, 0.12), w = c(1L, 
NA, NA, NA, NA), x = c(NA, 2L, 2L, NA, NA), y = c(NA, NA, NA, 
3L, NA), z = c(NA, NA, NA, NA, 4L)), class = "data.frame", 
row.names = c(NA, -5L))
akrun

Using na.omit from base R (this relies on each row having exactly one non-NA value among w:z):

dat <- transform(dat[1], Comb=apply(dat[-1], 1, na.omit))
#      A Comb
# 1 0.23    1
# 2 0.12    2
# 3 0.45    2
# 4 0.89    3
# 5 0.12    4
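
If some rows could have more than one non-NA value, or none at all, apply would return a list rather than a vector. A minimal base R sketch of a "first non-NA per row" alternative is shown below; it swaps in Reduce with ifelse instead of na.omit, and the names first_non_na and res_base are only for illustration.

```r
dat <- data.frame(
  A = c(0.23, 0.12, 0.45, 0.89, 0.12),
  w = c(1L, NA, NA, NA, NA),
  x = c(NA, 2L, 2L, NA, NA),
  y = c(NA, NA, NA, 3L, NA),
  z = c(NA, NA, NA, NA, 4L)
)

# Walk the columns left to right, keeping the first non-NA value seen in each row
first_non_na <- Reduce(function(a, b) ifelse(is.na(a), b, a), dat[-1])

res_base <- data.frame(A = dat$A, Comb = first_non_na)
print(res_base)
```

Rows where every column is NA simply end up with NA in Comb.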

Data

dat <- structure(list(A = c(0.23, 0.12, 0.45, 0.89, 0.12), w = c(1L, 
NA, NA, NA, NA), x = c(NA, 2L, 2L, NA, NA), y = c(NA, NA, NA, 
3L, NA), z = c(NA, NA, NA, NA, 4L)), row.names = c(NA, -5L), class = "data.frame")
jay.sf