3

I am aware of the existing questions about adding leading zeros and the comprehensive answers to them, such as Q1, Q2, Q3.

But, at least with my current knowledge, I cannot work out how to do the following:

  • add leading zeros within a string using a regex pattern match. Specifically, I want to pad only the digits after the - (to three digits, as in the expected results below).


For example:

Sam <- c("222-88", "537-457", "652-1", "787-892")
var <- LETTERS[1:4]
DF <- data.frame(Sam, var)
DF
      Sam var
1  222-88   A
2 537-457   B
3   652-1   C
4 787-892   D

Expected results:

      Sam var
1 222-088   A
2 537-457   B
3 652-001   C
4 787-892   D

I tried:

library(stringr)
temp <- DF[str_detect(DF$Sam, "-[0-9]{1,2}$"), ] # find the rows that need leading zeros
temp
     Sam var
1 222-88   A
3  652-1   C

formatC(temp$Sam, width = 2, flag = 0) # not correct!
Daniel

4 Answers

3

We can do this with base R: split each string on -, convert the second part to numeric, zero-pad it with sprintf, and then paste the two parts back together.

DF$Sam <- sapply(strsplit(as.character(DF$Sam), "-"), function(x) 
       paste(x[1],sprintf("%03d", as.numeric(x[2])), sep="-"))
DF$Sam
#[1] "222-088" "537-457" "652-001" "787-892"
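
The same split-and-pad idea can be wrapped in a small reusable function. This is only a sketch: the name `pad_after_dash` and its `width` argument are hypothetical (not from the answer above), and it assumes each string has exactly one - followed by digits. Strings that do not split into two parts are returned unchanged.

```r
# Hypothetical helper, not part of the original answer.
# Assumes inputs look like "digits-digits".
pad_after_dash <- function(x, width = 3) {
  parts <- strsplit(as.character(x), "-", fixed = TRUE)
  vapply(parts, function(p) {
    # Leave anything that does not split into exactly two parts unchanged
    if (length(p) != 2) return(paste(p, collapse = "-"))
    # Zero-pad the part after the dash to the requested width
    paste(p[1], formatC(as.integer(p[2]), width = width, flag = "0"), sep = "-")
  }, character(1))
}

pad_after_dash(c("222-88", "537-457", "652-1", "787-892"))
#[1] "222-088" "537-457" "652-001" "787-892"
```

Using vapply with character(1) rather than sapply guarantees a character vector back even for empty input.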

If we need a regex approach, we can use gsubfn:

library(gsubfn)
gsubfn("(\\d+)$", ~sprintf("%03d", as.numeric(x)), as.character(DF$Sam))
#[1] "222-088" "537-457" "652-001" "787-892"
akrun
  • Thanks. Can we do it with `stringr` or any other `packages`? In other words, are there any easier alternatives? – Daniel Aug 30 '17 at 19:53
  • @Daniel Updated with a `gsubfn` approach which uses `sprintf` so that it will prevent any errors – akrun Aug 30 '17 at 19:57
3

Another base option

Sam <- c("222-88", "537-457", "652-1", "787-892")
m <- gregexpr("[0-9]+$", Sam)
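# '%03s' zero-pads strings only on some platforms, so convert to integer and use '%03d'
regmatches(Sam, m) <- sprintf('%03d', as.integer(unlist(regmatches(Sam, m))))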
Sam

# [1] "222-088" "537-457" "652-001" "787-892"
rawr
1

An alternative in Base R is

DF$Sam = sub("-(\\d)\\b", "-00\\1", DF$Sam)
DF$Sam = sub("-(\\d\\d)\\b", "-0\\1", DF$Sam)
DF
      Sam var
1 222-088   A
2 537-457   B
3 652-001   C
4 787-892   D
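
If the target width might change, the two sub calls can be replaced by computing the number of zeros needed. The following is a sketch rather than part of the answer above: the variable names (`width`, `suffix`, `prefix`, `zeros`, `padded`) are made up for illustration, and it assumes every string contains a - and that R >= 3.3.0 is available for strrep. It never converts to numeric, so the suffix is treated purely as text.

```r
Sam <- c("222-88", "537-457", "652-1", "787-892")
width <- 3                                   # assumed target width

suffix <- sub(".*-", "", Sam)                # everything after the last dash
prefix <- sub("-.*", "-", Sam)               # everything up to and including the first dash
zeros  <- strrep("0", pmax(0, width - nchar(suffix)))  # zeros still missing

padded <- paste0(prefix, zeros, suffix)
padded
#[1] "222-088" "537-457" "652-001" "787-892"
```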
G5W
1

Sticking with tidyverse you can try:


Sam <- c("222-88", "537-457", "652-1", "787-892")
var <- LETTERS[1:4]
df <- data.frame(Sam, var)

library(dplyr)
library(tidyr)
library(stringr)

df %>% 
  separate(Sam, c("sam1", "sam2")) %>% 
  mutate(Sam = str_c(sam1, "-", str_pad(sam2, 3, "left", "0"))) %>% 
  select(-sam1, -sam2)

#>   var     Sam
#> 1   A 222-088
#> 2   B 537-457
#> 3   C 652-001
#> 4   D 787-892

# OR

df %>% 
  mutate(
    sam_new = str_c(
      str_extract(Sam, "^\\d+-"),
      str_extract(Sam, "\\d+$") %>% str_pad(3, "left", "0")
    )
  )

#>       Sam var sam_new
#> 1  222-88   A 222-088
#> 2 537-457   B 537-457
#> 3   652-1   C 652-001
#> 4 787-892   D 787-892
austensen