
I have a data frame with a character variable `id`. Valid IDs have 9 digits, but the column also contains three other values: blank, N/A and NA. I want to replace blank, N/A and NA with 999999999.

I tried using this:

df$id <- gsub('','999999999', df$id)

But this also modifies the valid IDs instead of only the blank entries. What is the best way to do this?

id <- c("", "N/A", "123456789", "NA", "123456789")
dummydata <- data.frame(id)
Pinky L
  • " all blanks (even with valid IDs)" You mean it replaces only the blank? Or it replaces every value with blank? – TylerH Jun 30 '17 at 20:38
  • Please make a [reproducible example](https://stackoverflow.com/a/5963610/6103040). – F. Privé Jun 30 '17 at 20:44
  • Please provide a sample of your data using `dput`. If the data is too long, you can provide a sample with `dput(head(df$id, 15))` – G5W Jun 30 '17 at 20:44
  • It replaces every blank space, so in a valid ID, it will have 999999999 ID 999999999. – Pinky L Jun 30 '17 at 20:55
  • `library(tidyverse); df %>% mutate(id = parse_number(id), id = coalesce(id, 999999999))` – alistaire Jun 30 '17 at 21:00

1 Answer

Consider this reproducible example:

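set.seed(100)
# IDs 1 to 10 mixed with the four kinds of missing entry, each occurring twice
dt <- data.frame(id = sample(rep(c(1:10, c(NA, "N/A", " ", "")), 2)))
replace_value <- "999999999"  # a string, so no numeric-to-character conversion is involved

dt$orig <- dt$id  # keep the original values for comparison
dt$id                    <- gsub("^ +$", replace_value, dt$id)   # entries made up only of spaces
dt$id                    <- gsub("^N/A$", replace_value, dt$id)  # literal "N/A" entries
dt$id[is.na(dt$id)]      <- replace_value                        # true NA values
dt$id[nchar(dt$id) == 0] <- replace_value                        # empty strings
dt  # print the cleaned column next to the original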
          id    orig
1          2    2
2          8    8
3  999999999     
4         10   10
5          9    9
6          8    8
7  999999999     
8          5    5
9          4    4
10 999999999  N/A
11         4    4
12         3    3
13         6    6
14 999999999  N/A
15 999999999 <NA>
16 999999999     
17 999999999     
18         9    9
19         7    7
20        10   10
21         2    2
22         3    3
23 999999999 <NA>
24         1    1
25         5    5
26         6    6
27         1    1
28         7    7
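
The same replacement can also be sketched without regular expressions, by comparing whole values against a set of placeholders. This is only a minimal sketch: `replace_missing_ids` and its `sentinel` and `placeholders` arguments are hypothetical names introduced here, and it assumes that whitespace-only entries should count as blank. Unlike the steps above, it also covers the literal string "NA" from the question's sample data.

```r
# Hypothetical helper: swap placeholder strings and NA for a sentinel ID.
replace_missing_ids <- function(x, sentinel = "999999999",
                                placeholders = c("", "N/A", "NA")) {
  # Convert factors to character and strip surrounding whitespace,
  # so that " " becomes "" and is caught by the placeholder set.
  x <- trimws(as.character(x))
  # Whole-value comparison: valid IDs are never partially rewritten.
  x[is.na(x) | x %in% placeholders] <- sentinel
  x
}

# Sample data from the question.
dummydata <- data.frame(
  id = c("", "N/A", "123456789", "NA", "123456789"),
  stringsAsFactors = FALSE
)
dummydata$id <- replace_missing_ids(dummydata$id)
```

Because `%in%` matches entire values rather than patterns inside them, a valid ID passes through unchanged even if it happens to contain a space or the letters "NA".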
jsta