Merging two columns into one in R

Question

I have the following data frame and am trying to merge the two columns into one, replacing the NAs with the numeric values.

ID    A     B
1     3     NA
2     NA    2
3     NA    4
4     1     NA

The result I want is:

Thanks in advance!

score 36 · Answer 1 · answered May 17 '17 at 17:58

This probably didn't exist when the answers were written, but since I came here with the same question and found a better solution, here it is for future googlers:

What you want is the coalesce() function from dplyr:

library(dplyr)

y <- c(1, 2, NA, NA, 5)
z <- c(NA, NA, 3, 4, 5)
coalesce(y, z)

[1] 1 2 3 4 5
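Applied to the data frame from the question, this might look like the sketch below. The name `df` and the new column name `New` are my own choices for illustration, not something the question specifies.

```r
library(dplyr)

# Data frame from the question (the name `df` is assumed)
df <- data.frame(ID = 1:4, A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA))

# Take A where it is present, otherwise fall back to B
df$New <- coalesce(df$A, df$B)
df[, c("ID", "New")]
```

If a row has values in both columns, coalesce() keeps the first argument (A here), so the argument order decides which column wins.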

score 16 · Answer 2 · answered Apr 14 '15 at 18:22

You can also do: with(d,ifelse(is.na(A),B,A))

where d is your data frame.

answered by User7598

Only works if NA is exclusive either in one or the other column! But good approach for this particular case! – Colonel Beauvel Apr 14 '15 at 18:28
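To see what happens outside the question's case, here is a small sketch on made-up data (rows 5 and 6 are hypothetical, added only to show the edge cases). As far as I can tell, ifelse() does not fail there: it prefers A when both columns have a value and returns NA when both are missing.

```r
# Hypothetical data: row 5 has values in both columns, row 6 has none
d <- data.frame(ID = 1:6,
                A = c(3, NA, NA, 1, 7, NA),
                B = c(NA, 2, 4, NA, 8, NA))

# Use B only where A is missing
res <- with(d, ifelse(is.na(A), B, A))
res
```

Whether "A wins" is the right rule for rows with two values depends on your data, so it is worth checking for such rows first.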

Hao · Answer 3 · 2015-08-28T16:29:33.090

score 16

Another very simple solution in this case is to use the rowSums function.

df$New <- rowSums(df[, c("A", "B")], na.rm = TRUE)
df <- df[, c("ID", "New")]

Update: Thanks @Artem Klevtsov for mentioning that this method only works with numeric data.

edited Aug 28 '15 at 16:29

answered Apr 14 '15 at 18:33


Also be careful that if both "A" and "B" columns have NAs then a 0 will be returned in "New". Not a problem in the test case. – thisisrg Oct 31 '16 at 02:38
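A quick sketch of the edge cases for rowSums(), on made-up data (not from the question). It also shows one possible guard that puts NA back where every input was missing; the variable names are mine.

```r
# Hypothetical data: row 2 is all NA, row 3 has values in both columns
df <- data.frame(ID = 1:3, A = c(3, NA, 5), B = c(NA, NA, 6))

# Row 2 becomes 0 (nothing to add), row 3 becomes 5 + 6
summed <- rowSums(df[, c("A", "B")], na.rm = TRUE)

# One possible guard: restore NA where both columns were missing
guarded <- summed
guarded[rowSums(!is.na(df[, c("A", "B")])) == 0] <- NA
guarded
```

Note that the guard does nothing about rows where both columns have a value: those are still summed, which is usually not what "merging" means.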

score 15 · Answer 4 · answered Apr 14 '15 at 18:14

You can use unite from tidyr:

library(tidyr)

df[is.na(df)] <- ''            # A and B become character columns
unite(df, new, A:B, sep = '')  # so `new` is character too; wrap in as.numeric() if needed
#  ID new
#1  1   3
#2  2   2
#3  3   4
#4  4   1

answered by Colonel Beauvel

A:B could be replaced by column index like 1:2. – Decula Dec 06 '17 at 18:12

akrun · Answer 5 · 2015-04-14T18:14:02.507

score 9

You could try

New <- do.call(pmax, c(df1[-1], na.rm=TRUE))

Or

New <-  df1[-1][cbind(1:nrow(df1),max.col(!is.na(df1[-1])))]
d1 <- data.frame(ID=df1$ID, New)
d1
#  ID New
#1  1   3
#2  2   2
#3  3   4
#4  4   1

edited Apr 14 '15 at 18:14

answered Apr 14 '15 at 18:10


score 6 · Answer 6 · answered Apr 14 '15 at 21:39

Assuming exactly one of A and B is NA in each row, this works just fine:

# creating initial data frame (actually data.table in this case)
library(data.table)
y <- as.data.table(list(ID = c(1,2,3,4), A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA)))
y
#   ID  A  B
#1:  1  3 NA
#2:  2 NA  2
#3:  3 NA  4
#4:  4  1 NA


#solution
y[,New := na.omit(c(A,B)), by = ID][,c("A","B"):=NULL]
y
#   ID New
#1:  1   3
#2:  2   2
#3:  3   4
#4:  4   1

answered by Krome

nice! Also, `y[, New := pmin(A, B, na.rm=TRUE)]` – Arun Apr 14 '15 at 23:21

score 4 · Answer 7 · answered Mar 13 '19 at 21:48

This question's been around for a while, but just to add another possible approach that does not depend on any libraries:

df$new = t(df[-1])[!is.na(t(df[-1]))]

#   ID  A  B new
# 1  1  3 NA   3
# 2  2 NA  2   2
# 3  3 NA  4   4
# 4  4  1 NA   1
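The transpose trick relies on every row having exactly one non-NA value. If that might not hold, a small base R helper along these lines could be an alternative. `coalesce_base` is a hypothetical name of my own, not a base R function; it is only a sketch built from Reduce() and ifelse().

```r
# Hypothetical helper: first non-NA value, element by element,
# across any number of equal-length vectors
coalesce_base <- function(...) {
  Reduce(function(a, b) ifelse(is.na(a), b, a), list(...))
}

# Data frame from the question (the name `df` is assumed)
df <- data.frame(ID = 1:4, A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA))
df$new <- coalesce_base(df$A, df$B)
df
```

Because it works element by element, rows stay aligned even when a row has two values (the earlier argument wins) or none (the result is NA).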

answered by dww
