1

This question is very easy to understand, but I can't wrap my head around how to get a solution. Let's say I have a character vector and I want to modify each element so that it ends with exactly 5 digits, with any missing digits filled in by leading zeros:

Smth1     Smth00001
Smth22    Smth00022
Smth333   Smth00333
Smth4444  Smth04444
Smth55555 Smth55555

I guess it can be done with regex and functions like gsub, but I don't understand how to take into account the length of the string being replaced.
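To make the "length" part of the question concrete, here is a minimal base R sketch that computes the number of missing zeros explicitly with nchar and strrep. It assumes every element is a non-digit prefix followed by a run of digits; the names prefix, digits, zeros and padded are only illustrative.

```r
v <- c("Smth1", "Smth22", "Smth333", "Smth4444", "Smth55555")

# Split each element into its non-digit and digit characters
# (assumes digits occur only at the end of each string)
prefix <- gsub("[0-9]", "", v)
digits <- gsub("[^0-9]", "", v)

# Number of zeros still needed to reach 5 digits (never negative)
zeros <- strrep("0", pmax(0, 5 - nchar(digits)))

padded <- paste0(prefix, zeros, digits)
print(padded)
```

strrep needs R 3.3.0 or later; pmax keeps the repeat count at zero when a string already has 5 or more digits.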

Poiu Rewq
  • Not really, dww. One person proposed str_pad from stringr, and it really solves the issue, though that person was asking about something else. – Poiu Rewq Jul 14 '16 at 19:59

5 Answers

2

Here's an idea using stringi:

v <- c("Smth1", "Smth22", "Smth333", "Smth4444", "Smth55555")

library(stringi)
d <- stri_extract(v, regex = "[:digit:]+")
a <- stri_extract(v, regex = "[:alpha:]+")
paste0(a, stri_pad_left(d, 5, "0"))

Which gives:

[1] "Smth00001" "Smth00022" "Smth00333" "Smth04444" "Smth55555"
Steven Beaupré
2

Using base R. Someone else can prettify the regex:

x <- c("Smth1", "Smth22", "Smth333", "Smth4444", "Smth55555")
sprintf("%s%05d", gsub("^([^0-9]+)..*$", "\\1", x),
  as.numeric(gsub("^..*[^0-9]([0-9]+)$", "\\1", x)))

[1] "Smth00001" "Smth00022" "Smth00333" "Smth04444" "Smth55555"
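
One possible way to tidy the regexes, offered only as a sketch: strip the trailing digits to get the prefix, strip the leading non-digits to get the number, and let sprintf do the padding. It assumes the same input shape as above (non-digits followed by digits); prefix, digits and result are illustrative names.

```r
x <- c("Smth1", "Smth22", "Smth333", "Smth4444", "Smth55555")

prefix <- sub("[0-9]+$", "", x)    # drop the trailing digit run
digits <- sub("^[^0-9]*", "", x)   # drop the leading non-digit run

# %05d pads the integer with zeros to a width of 5
result <- sprintf("%s%05d", prefix, as.integer(digits))
print(result)
```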
Zelazny7
2

Here is a simple one-line solution similar to Zelazny7's, but using a replacement callback with gsubfn from the gsubfn package:

> library(gsubfn)
> v <- c("Smth1", "Smth22", "Smth333", "Smth4444", "Smth55555")
> gsubfn('[0-9]+$', ~ sprintf("%05d",as.numeric(x)), v)
[1] "Smth00001" "Smth00022" "Smth00333" "Smth04444" "Smth55555"

The regex [0-9]+$ matches one or more digits, and the $ anchor restricts the match to the end of the string. The matched digits are passed to the callback (the formula after ~), where sprintf("%05d",as.numeric(x)) converts them to a number with as.numeric and pads that number with zeros to a width of 5.
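
If installing gsubfn is not an option, roughly the same "match, transform, put back" idea can be sketched in base R with regexpr and regmatches. This is an alternative technique, not what gsubfn does internally.

```r
v <- c("Smth1", "Smth22", "Smth333", "Smth4444", "Smth55555")

# Locate the trailing digit run in each element (-1 where there is none)
m <- regexpr("[0-9]+$", v)

# Replace each match in place with its zero-padded form;
# elements without a match are left untouched
regmatches(v, m) <- sprintf("%05d", as.integer(regmatches(v, m)))
print(v)
```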

To modify only strings that consist of one or more non-digit characters followed by one or more digits up to the end, use this PCRE-based gsubfn call:

> gsubfn('^[^0-9]+\\K([0-9]+)$', ~ sprintf("%05d",as.numeric(x)), v, perl=TRUE)
[1] "Smth00001" "Smth00022" "Smth00333" "Smth04444" "Smth55555"

where

  • ^ - start of string
  • [^0-9]+\\K - matches one or more non-digit characters; \K then drops them from the reported match, so only the digits that follow are replaced
  • ([0-9]+) - Group 1 passed to the callback
  • $ - end of string.
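
The effect of \K can be seen in isolation with a small base R sketch (sub with perl = TRUE, not gsubfn). The "#" replacement is only a placeholder chosen for illustration.

```r
v <- c("Smth1", "Smth22", "Smth333", "Smth4444", "Smth55555")

# With \K the non-digit prefix is matched but excluded from the
# replaced text, so only the digits are swapped for "#"
with_k <- sub("^[^0-9]+\\K[0-9]+$", "#", v, perl = TRUE)

# Without \K the whole string is part of the match and gets replaced
without_k <- sub("^[^0-9]+[0-9]+$", "#", v, perl = TRUE)

print(with_k)
print(without_k)
```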
Wiktor Stribiżew
-1

Here is a solution using the stringr package:

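library(stringr)
library(dplyr)  # only needed for the %>% pipe

# [0-9] rather than [1-9], so that numbers containing a zero (e.g. "10") are kept whole
num <- str_extract(v, "[0-9]+")
prefix <- str_extract(v, "[^0-9]+")
# target width of the zero-padded prefix: prefix length plus the missing digits
padding <- nchar(prefix) + 5 - nchar(num)
output <- paste0(prefix %>%
                 str_pad(width = padding, side = "right", pad = "0"), num)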

The output is:

"Smth00001" "Smth00022" "Smth00333" "Smth04444" "Smth55555"
thepule
-1
library(stringr)
paste0(str_extract(v,'\\D+'),str_pad(str_extract(v,'\\d+'),5,'left', '0'))
#[1] "Smth00001" "Smth00022" "Smth00333" "Smth04444" "Smth55555"
Shenglin Chen