I have a dataset that contains multiple observations per person. In some cases, an individual's ethnicity is recorded in some rows but missing in others. In R, how can I replace the NAs with the ethnicity recorded in that person's other rows, without changing them manually?

Example:

PersonID        Ethnicity
   1                A
   1                A
   1                NA
   1                NA
   1                A
   2                NA
   2                B
   2                NA
   3                NA
   3                NA
   3                A
   3                NA

Need:

PersonID        Ethnicity
   1                A
   1                A
   1                A
   1                A
   1                A
   2                B
   2                B
   2                B
   3                A
   3                A
   3                A
   3                A
MJJ

1 Answer

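You could use fill() from tidyr after grouping by PersonID. With .direction = "downup", each NA is filled from the nearest non-missing value above it in the group, and any leading NAs are then filled from below: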

library(dplyr)
library(tidyr)

df %>%
  group_by(PersonID) %>%
  fill(Ethnicity, .direction = "downup")

# A tibble: 12 x 2
# Groups:   PersonID [3]
   PersonID Ethnicity
      <int> <fct>    
 1        1 A        
 2        1 A        
 3        1 A        
 4        1 A        
 5        1 A        
 6        2 B        
 7        2 B        
 8        2 B        
 9        3 A        
10        3 A        
11        3 A        
12        3 A        
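
If installing tidyr is not an option, roughly the same result can be had in base R. The following is only a sketch: it rebuilds the example data from the question, and `fill_group` is a hypothetical helper name, not part of any package. It assumes each person has at most one distinct non-missing ethnicity, so unlike fill() it copies the first known value in the group rather than the nearest one.

```r
# Example data from the question; Ethnicity is kept as character, not factor
df <- data.frame(
  PersonID  = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  Ethnicity = c("A", "A", NA, NA, "A", NA, "B", NA, NA, NA, "A", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace NAs in x with the first non-missing value of x.
# A group that is entirely NA is returned unchanged.
fill_group <- function(x) {
  known <- x[!is.na(x)]
  if (length(known) > 0) x[is.na(x)] <- known[1]
  x
}

# ave() applies fill_group within each PersonID and returns the
# results in the original row order
df$Ethnicity <- ave(df$Ethnicity, df$PersonID, FUN = fill_group)
```

If a person could have conflicting ethnicities recorded, the two approaches may give different results, so it is worth checking for that before filling.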
Onyambu