Substitute NA values with another column's values

Question

I have 2 columns with survey answers:

subject_id | income | income unknown |
1          | 1000   | NA             |
2          | 15000  | NA             |
3          | NA     | 15000          |
4          | NA     | 10000          |
5          | 15000  | NA             |

I would like to merge the two columns. The second column has a value if the 1st income column was not answered (in most cases at least).

I thought of substituting NAs with zeros and then summing the two columns up, but that doesn't really work.

So, how many columns you want in expected? – akrun Apr 04 '17 at 12:12 — akrun, Apr 04 '17 at 12:12
You want `coalesce` from `dplyr` – Sotos Apr 04 '17 at 12:12 — Sotos, Apr 04 '17 at 12:12

akrun · Accepted Answer · 2017-04-04T12:27:17.797

2

We can use coalesce

library(tidyverse)
df1 %>%
    transmute(subject_id, income = coalesce(income, income_unknown))
#   subject_id income
#1          1   1000
#2          2  15000
#3          3  15000
#4          4  10000
#5          5  15000

Other base R options include rowSums, pmax, pmin etc if there is only one non-NA element per row

df1$income <- rowSums(df1[-1], na.rm = TRUE)
df1$income <- do.call(pmax, c(df1[-1], na.rm = TRUE))

data

df1 <- structure(list(subject_id = 1:5, income = c(1000L, 15000L, NA, 
NA, 15000L), income_unknown = c(NA, NA, 15000L, 10000L, NA)),
.Names = c("subject_id", 
"income", "income_unknown"), class = "data.frame",
 row.names = c(NA, -5L))

edited Apr 04 '17 at 12:27

answered Apr 04 '17 at 12:13

akrun

874,273
37
540
662

the new income column becomes NULL when i use coalesce – PCUnique Apr 04 '17 at 12:25
@PCUnique It is not happening on my side with `dplyr_0.5.0` – akrun Apr 04 '17 at 12:25

score 2 · Answer 2 · answered Apr 04 '17 at 12:19

2

You can use ifelse()

df$income.new <- ifelse(is.na(df$income), df$income_unknown, df$income)

answered Apr 04 '17 at 12:19

jogo

12,469
11
37
42

Substitute NA values with another column's values

2 Answers2

data