I have a data frame,

data.frame(Primary_key = c(100,100,100,100,200,200,200) , 
           values= c("buyer",NA,NA,NA,"seller",NA,NA))

I'm trying to get a desired output of,

data.frame(Primary_key = c(100,100,100,100,200,200,200) , 
           values= c("buyer","buyer","buyer","buyer","seller","seller","seller"))

This is a simplified version; the original has 3 possible values and over 10,000 different primary keys.

I'm looking for a dplyr way to do this, but I'm kind of stumped.

I'm trying to group_by the primary key and then use a replace function.

M--
yungFanta

1 Answer

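You can use tidyr::fill and dplyr::group_by together. With .direction = "downup", each group is filled downward first and then upward, so it does not matter where the known value sits within the group: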

library(dplyr)
library(tidyr)

df1 %>% 
  group_by(Primary_key) %>% 
  fill(values, .direction = "downup")
#> # A tibble: 10 x 2
#> # Groups:   Primary_key [4]
#>    Primary_key values
#>          <dbl> <fct> 
#>  1          50 <NA>  
#>  2         100 buyer 
#>  3         100 buyer 
#>  4         100 buyer 
#>  5         100 buyer 
#>  6         200 seller
#>  7         200 seller
#>  8         200 seller
#>  9         300 both  
#> 10         300 both
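To see why the direction matters, here is a minimal sketch on a made-up three-row frame (`toy` and its `key` column are hypothetical, not part of the question's data). It compares "down" with "downup" when the only known value sits in the middle of a group:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one group, known value only in the middle row
toy <- data.frame(key = c(1, 1, 1),
                  values = c(NA, "buyer", NA),
                  stringsAsFactors = FALSE)

# "down" only carries values forward, so the leading NA stays NA
down <- toy %>%
  group_by(key) %>%
  fill(values, .direction = "down") %>%
  ungroup()

# "downup" fills forward first, then backward, so every row is filled
downup <- toy %>%
  group_by(key) %>%
  fill(values, .direction = "downup") %>%
  ungroup()

down$values
downup$values
```

Grouping before fill should keep a value from leaking across into the next Primary_key, which is why the all-NA key 50 stays NA in the output above.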

Sample data: this covers different cases that could occur in the actual data (a key with no value at all, and known values in the first, middle, or last row of a group).

df1 <- data.frame(Primary_key = c(50,100,100,100,100,200,200,200,300,300), 
                  values= c(NA, NA,"buyer",NA,NA,"seller",NA,NA,NA,"both"))
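If you would rather stay within dplyr, as the question suggests, one possible sketch is to overwrite each group with its first non-missing value. This assumes every key has at most one distinct non-NA value, which holds for the sample data but is an assumption about the real data; `filled` is just an illustrative name.

```r
library(dplyr)

df1 <- data.frame(Primary_key = c(50,100,100,100,100,200,200,200,300,300),
                  values = c(NA, NA,"buyer",NA,NA,"seller",NA,NA,NA,"both"),
                  stringsAsFactors = FALSE)

filled <- df1 %>%
  group_by(Primary_key) %>%
  # First non-NA value in the group; indexing an empty vector with [1]
  # gives NA, so a key with no known value (50 here) stays NA
  mutate(values = values[!is.na(values)][1]) %>%
  ungroup()

filled
```

Unlike fill, this would silently pick the first value if a key ever carried two different ones, so it is worth checking that assumption before relying on it.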
M--