0

I have data which looks like this

Linking <- data.frame(
ID = c(round((runif(20, min = 10000, max = 99999)), digits = 0), NA, NA, NA, NA),
PSU = c(paste("A", round((runif(20, min = 10000, max = 99999)), digits = 0), sep = ''), NA, NA, NA, NA),
qtr = c(rep(1:10, 2), NA, NA, NA, NA)
)

Linking$Key <- paste(Linking$ID, Linking$PSU, Linking$qtr, sep = "_")
Linking$Key[c(21:24)] <- c("87654_A15467_1", "45623_A23456_2", "67891_A12345_4", "65346_A23987_7")

What I want to do is populate the NA values for ID, PSU, and qtr from the information from "Key", but only for the rows with NA values.

Does anyone know how to do this?

This code does what I want, but it does it for all values of each variable. I want to do this just for rows where the values are NA.

Linking2 <- Linking
Linking2$ID <- substr(Linking$Key,1,5)
Linking2$PSU <- substr(Linking$Key,7,12)
Linking2$qtr <- substr(Linking$Key, 14,15)
Laura
  • 499
  • 5
  • 13
  • This is basically [this](https://stackoverflow.com/questions/7069076/split-column-at-delimiter-in-data-frame) question but only for specific rows . – Ronak Shah Apr 03 '18 at 03:41

2 Answers2

2

The basic idea here is to assign using a logical index vector.

Linking$ID[is.na(Linking$ID)] <- substr(Linking$Key,1,5)[is.na(Linking$ID)]
De Novo
  • 7,120
  • 1
  • 23
  • 39
  • Got it. Thanks! – Laura Apr 03 '18 at 03:33
  • Your welcome! I was going to demonstrate the whole procedure with a loop, and then realized your extraction is a little complicated and not applied in the same way across the columns. – De Novo Apr 03 '18 at 03:35
  • A similar idea with `data.table` would be `setDT(Linking)[is.na(ID), ID := substr(Key, 1, 5)][]` – akrun Apr 03 '18 at 05:12
0

You can also use tidyr::separate with dplyr::coalesce to separate values from Key and fill NA values in the first three columns.

library(tidyverse);
Linking %>%
    separate(Key, into = paste0("tmp", 1:3), sep = "_") %>%
    mutate(ID = coalesce(tmp1), PSU = coalesce(tmp2), qtr = coalesce(tmp3)) %>%
    select(-tmp1, -tmp2, -tmp3);
#      ID    PSU qtr
#1  56421 A20914   1
#2  30912 A97582   2
#3  97547 A73397   3
#4  28277 A35255   4
#5  45107 A14867   5
#6  91099 A26196   6
#7  21117 A69721   7
#8  69065 A34741   8
#9  28781 A96847   9
#10 26098 A93942  10
#11 12680 A15705   1
#12 35231 A68219   2
#13 70556 A47693   3
#14 98281 A55697   4
#15 52687 A95023   5
#16 46129 A95540   6
#17 48721 A78827   7
#18 77989 A36778   8
#19 11757 A29458   9
#20 16575 A78892  10
#21 87654 A15467   1
#22 45623 A23456   2
#23 67891 A12345   4
#24 65346 A23987   7
Maurits Evers
  • 49,617
  • 4
  • 47
  • 68