I have a data set named DATA_TEST. This data frame contains 7 observations in character format; the code below creates it.

#DATA SET
DATA_TEST<-data.frame(
  Ten_digits=c("NA","207","0101","0208 90","0206 90 99 00","103","9706 00 00 00"),
  stringsAsFactors = FALSE)
View(DATA_TEST)

My goal is to transform this data frame with stringr or another package. More precisely, the code should find only the values that contain 10 digits, such as "0206 90 99 00" or "9706 00 00 00", and remove the spaces from them, giving "0206909900" and "9706000000". All other values should stay unchanged.

Can anybody help me solve this problem?

silent_hunter

2 Answers

You can try with stringr and dplyr:

library(dplyr)
library(stringr)
# Strip spaces only from values that contain exactly 10 digits
DATA_TEST %>%
  mutate(Ten_digits = if_else(str_count(Ten_digits, "[0-9]") == 10,
                              str_replace_all(Ten_digits, fixed(" "), ""),
                              Ten_digits))

  Ten_digits
1         NA
2        207
3       0101
4    0208 90
5 0206909900
6        103
7 9706000000

Or with stringr and base R:

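library(stringr)
# Returns a character vector; assign it to DATA_TEST$Ten_digits to update the data frame
with(DATA_TEST, ifelse(str_count(Ten_digits, "[0-9]") == 10,
                        str_replace_all(Ten_digits, fixed(" "), ""),
                        Ten_digits))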
tmfmnk
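
For comparison, the digit-count condition used above can probably also be written without stringr. The following is only a sketch in base R, not part of the original answer: it counts digits by deleting every non-digit character and measuring what remains. The names `x`, `n_digits` and `result` are made up for illustration.

```r
# Example data from the question ("NA" is a literal string here, not a missing value)
DATA_TEST <- data.frame(
  Ten_digits = c("NA", "207", "0101", "0208 90", "0206 90 99 00", "103", "9706 00 00 00"),
  stringsAsFactors = FALSE)

x <- DATA_TEST$Ten_digits

# Count digits by deleting every non-digit character and measuring what is left
n_digits <- nchar(gsub("[^0-9]", "", x))

# Remove literal spaces only where exactly ten digits were found
result <- ifelse(n_digits == 10, gsub(" ", "", x, fixed = TRUE), x)
print(result)
```

To update the data frame itself, the vector can be assigned back with `DATA_TEST$Ten_digits <- result`.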

One way is to count the characters in each value after removing whitespace, and then replace only the values whose stripped length is 10.

temp <- gsub("\\s", "", DATA_TEST$Ten_digits)
DATA_TEST$Ten_digits[nchar(temp) == 10] <- temp[nchar(temp) == 10]

DATA_TEST
#  Ten_digits
#1         NA
#2        207
#3       0101
#4    0208 90
#5 0206909900
#6        103
#7 9706000000
Ronak Shah
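
Note that a length check of 10 accepts any ten non-whitespace characters, not only digits. If the column might contain letters or other symbols, a stricter test may be useful. The sketch below is an assumption-based variant, not the answer's own code: it uses `grepl()` with the pattern `^[0-9]{10}$`, and the vector `codes` holds hypothetical values invented to show the difference.

```r
# Hypothetical input: the last two values are made up for illustration
codes <- c("0206 90 99 00", "0208 90", "ABCD 12 34 56", "0206\t90 99 00")

# Strip all whitespace (spaces, tabs, ...) first
stripped <- gsub("\\s", "", codes)

# TRUE only when the stripped value is exactly ten digits and nothing else
is_ten_digits <- grepl("^[0-9]{10}$", stripped)

# Replace only the values that passed the strict check
codes[is_ten_digits] <- stripped[is_ten_digits]
print(codes)
```

Here "ABCD 12 34 56" is left untouched even though it has ten non-whitespace characters, because it is not made up of digits only.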