R: How to fill in NA Values within a Column based on grouping?

Question

I'm looking to replace the NA values in the 'first' column of this example data frame with either 'A' or 'B', depending on the category in the 'second' column (A for A1, B for B1):

df <- data.frame(first = c("A","A",NA,NA,"B",NA,NA,NA),second = c(rep("A1",4),rep("B1",4)))
df
  first second
1     A     A1
2     A     A1
3  <NA>     A1
4  <NA>     A1
5     B     B1
6  <NA>     B1
7  <NA>     B1
8  <NA>     B1

This is what I would like the resulting data frame to look like:

  first second
1     A     A1
2     A     A1
3     A     A1
4     A     A1
5     B     B1
6     B     B1
7     B     B1
8     B     B1

I tried this solution but obviously it didn't work:

df$first[is.na(df$first)] <- unique(df[!is.na(df$first),"first"])

I have a feeling there might be a dplyr solution but cannot think of it.

Thank you!

`df$first[is.na(df$first)] = substr(df$second[is.na(df$first)], 1, 1)` – tblznbits, Oct 19 '17 at 21:33
I don't think this is an exact duplicate of question 23340150. The aim here is to replace NA based on the value of a second column, not the most recent non-NA of the same column. — neilfws, Oct 19 '17 at 21:48

Maurits Evers · Accepted Answer · 2017-10-19T21:40:52.960


No need for dplyr. This should work in base R:

df$first[is.na(df$first)] <- gsub("(\\w)\\d", "\\1", df$second[is.na(df$first)])

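Explanation: gsub matches a word character followed by a digit in second (e.g. "A1") and keeps only the captured first character ("A"). The assignment then writes those values into the positions where first is NA. Note that \w matches any letter, digit, or underscore, not only letters, which is enough for the codes in this example.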

  first second
1     A     A1
2     A     A1
3     A     A1
4     A     A1
5     B     B1
6     B     B1
7     B     B1
8     B     B1
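
For anyone who wants to run this end to end, here is a self-contained sketch. It assumes character columns (`stringsAsFactors = FALSE`), which is not stated in the question. The second half is a different technique from the regex above: it uses base R's `ave()` to copy the single known value within each `second` group, so it does not rely on the naming pattern. The names `missing`, `df_regex`, and `df_group` are just illustrative.

```r
# Example data from the question, built with character columns
# (an assumption; before R 4.0, data.frame() created factors by default).
df <- data.frame(
  first  = c("A", "A", NA, NA, "B", NA, NA, NA),
  second = c(rep("A1", 4), rep("B1", 4)),
  stringsAsFactors = FALSE
)

# Approach 1: the regex from the answer.
# Drop the digit from `second` and write the result into the NA rows.
df_regex <- df
missing <- is.na(df_regex$first)
df_regex$first[missing] <- gsub("(\\w)\\d", "\\1", df_regex$second[missing])

# Approach 2 (alternative, not from the answer): fill within each group.
# For every value of `second`, take the unique non-NA value of `first`
# and use it for the NA rows, but only if exactly one such value exists.
df_group <- df
df_group$first <- ave(df$first, df$second, FUN = function(x) {
  known <- unique(x[!is.na(x)])
  if (length(known) == 1) x[is.na(x)] <- known
  x
})

print(df_regex)
print(df_group)
```

The group-based version leaves a group untouched when it has no known value or more than one, which may or may not be what you want.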

edited Oct 19 '17 at 21:40

answered Oct 19 '17 at 21:36

Maurits Evers


I believe best practice is to avoid regular expressions when possible. – tblznbits Oct 19 '17 at 21:39

I disagree. Given that `second` is a string, regexp is the way to go. It allows for way more flexibility than substring extractions based on coordinates... – Maurits Evers Oct 19 '17 at 21:42
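
To make the trade-off in these comments concrete, here is a small sketch comparing the two approaches. The pattern `"\\d+$"` (strip trailing digits) is an illustrative choice, not the one from the answer, and the code `"AB12"` is a made-up value that does not appear in the question.

```r
# Codes from the question: one letter followed by one digit.
codes <- c("A1", "A1", "B1", "B1", "B1")

# Position-based extraction: take the first character.
by_position <- substr(codes, 1, 1)

# Pattern-based extraction: remove the trailing run of digits.
by_regex <- sub("\\d+$", "", codes)

# On the question's data both give the same letters.
print(identical(by_position, by_regex))

# They diverge once the prefix is longer than one character
# ("AB12" is a hypothetical code used only for illustration).
longer <- "AB12"
print(substr(longer, 1, 1))      # keeps only the first character
print(sub("\\d+$", "", longer))  # keeps the whole non-digit prefix
```

So for fixed-width codes like A1 and B1 either works, and the choice only starts to matter if the prefix length can vary.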
