0

My df1 looks like

PID End_record_date
123 NA
123 15-08-2020
234 NA
234 19-07-2020
345 NA

my df2 has matched ids

PID Record_date
123 13-10-2018
234 14-07-2019
345 20-08-2020

Expected result i need

PID End_record_date
123 13-10-2018
123 15-08-2020
234 14-07-2019
234 19-07-2020
345 20-08-2020

I need only NAs to be filled without disturbing other values

Thanks

Katrix_02
  • 83
  • 6

1 Answers1

2

You can use match :

inds <- is.na(df1$End_record_date)
df1$End_record_date[inds] <- df2$Record_date[match(df1$PID[inds], df2$PID)]
df1

#  PID End_record_date
#1 123      13-10-2018
#2 123      15-08-2020
#3 234      14-07-2019
#4 234      19-07-2020
#5 345      20-08-2020

Other option is to join the two dataframes and select the first non-NA value from the two.

This can be done in base R as :

transform(merge(df1, df2, by = 'PID'), End_record_date = 
           ifelse(is.na(End_record_date), Record_date, End_record_date))

Or in dplyr :

library(dplyr)

inner_join(df1, df2, by = 'PID') %>%
  mutate(End_record_date = coalesce(End_record_date,Record_date)) %>%
  select(PID, End_record_date)
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213