"tidyr like" fill na from different column

Question

I have a data frame with missing values in some columns (who doesn't?). For example:

df <- data.frame(x = c(2,NA,4), y = 5:7)
df
   x y
1  2 5
2 NA 6
3  4 7

I would like to replace each missing value with the value from a different column in the same row. Obviously there are a lot of ways to do so, for example:

 df %>%
   mutate(x = ifelse(is.na(x), y, x))

  x y
1 2 5
2 6 6
3 4 7

However, I am looking for something more elegant, like

df %>% fill(x,y)

but couldn't find anything. Does something like this exist?

Thanks!

Instead of ifelse you could use dplyr's `coalesce` function, i.e. `df %>% mutate(x = coalesce(x, as.numeric(y)))` — talat, Nov 09 '17 at 11:51
If there was a pure `tidyr` solution, I bet it would have appeared here: [How to implement coalesce efficiently in R](https://stackoverflow.com/questions/19253820/how-to-implement-coalesce-efficiently-in-r) — Henrik, Nov 09 '17 at 12:17

Gregor Thomas · Accepted Answer · 2017-11-09T14:41:43.267

You want to change values in a single column, keeping the same number of rows. The tidyverse way to do that is dplyr::mutate, and the tidyverse implementation of the specific operation you want is dplyr::coalesce, as talat suggested in the comments:

df %>% mutate(x = coalesce(x, y))

Things would be less tidy and less consistent if there were a single function that combined these two steps, because the operation acts on a single column rather than on the whole data frame. It would also be less flexible: coalesce can be used on vectors whether or not they are in a data frame, which is good!
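To illustrate the flexibility point, here is a minimal sketch of coalesce on bare vectors and then inside mutate, assuming dplyr is installed. The as.numeric() cast follows talat's comment: x is double and y is integer, and some dplyr versions are strict about the arguments sharing a type. The names filled and result are only for illustration.

```r
library(dplyr)

x <- c(2, NA, 4)
y <- 5:7

# coalesce works on plain vectors: keep x where present, otherwise take y
filled <- coalesce(x, as.numeric(y))

# the same call inside mutate, on the data frame from the question
df <- data.frame(x = x, y = y)
result <- df %>% mutate(x = coalesce(x, as.numeric(y)))
```

Both calls fill the missing entry of x with the value of y from the same position, and y itself is left untouched.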

(I actually dislike tidyr::fill. I suppose it is consistent because it operates on all columns of the data frame, but I would prefer that it took a single vector and was typically used inside mutate. mutate_all(fill) would then be easy enough for doing the whole data frame. So I end up still relying on zoo::na.locf for general use.)
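A rough sketch of the difference in calling style, assuming dplyr, tidyr and zoo are all installed. Note that both functions carry the last observed value forward within a column, which is a different operation from filling from another column. The names filled_df and locf_df are only for illustration.

```r
library(dplyr)
library(tidyr)
library(zoo)

df <- data.frame(x = c(2, NA, 4), y = 5:7)

# tidyr::fill takes the data frame plus the columns to fill downwards
filled_df <- df %>% fill(x)

# zoo::na.locf takes a vector, so it slots into mutate;
# na.rm = FALSE keeps leading NAs so the length is unchanged
locf_df <- df %>% mutate(x = na.locf(x, na.rm = FALSE))
```

In both results the NA in x is replaced by the value above it in the same column, not by the value of y.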

Thank you @Gregor, I was not familiar with `coalesce` and it indeed addresses my need. — Adiel Loinger, Nov 10 '17 at 05:17

denis · Answer 2 · score 3 · 2017-11-09T14:24:22.173

I am aware this doesn't fully answer the question, but I find the standard data frame way not so bad:

df$x[is.na(df$x)] <- df$y[is.na(df$x)]

and the data.table way quite simple and elegant:

# requires library(data.table); setDT() converts df to a data.table by reference
setDT(df)[is.na(x), x := y]

edited Nov 09 '17 at 14:24

answered Nov 09 '17 at 12:47

Thank you @denis, I agree the data.table way is simple and elegant, but in general I'm more of a tidyverse user – Adiel Loinger Nov 10 '17 at 05:14

score 0 · Answer 3 · answered Nov 09 '17 at 13:00

try this, good luck

df <- t(apply(df, 1, function(x) if(any(is.na(x))) rep(x[!is.na(x)], 2) else x))
as.data.frame(df)

myincas

This will convert to a matrix and thus destroy any `class` differences present in the data frame. – Gregor Thomas Nov 09 '17 at 13:17
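A small base R sketch of that effect, using a hypothetical data frame df2 with one numeric and one character column (not the data from the question):

```r
# hypothetical example data with two different column classes
df2 <- data.frame(x = c(2, NA, 4), y = c("a", "b", "c"),
                  stringsAsFactors = FALSE)

# apply() coerces the data frame to a matrix first, and a matrix has one type
m <- t(apply(df2, 1, function(r) r))
out <- as.data.frame(m, stringsAsFactors = FALSE)

classes_before <- sapply(df2, class)  # x is numeric, y is character
classes_after  <- sapply(out, class)  # both columns are now character
```

After the round trip through apply, the numeric column x comes back as character, so any later arithmetic on it would need an explicit conversion.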