-2

I have a list of 5-digit US postal ZIP codes, but some of them lost their leading zeros. How do I add those zeros back while leaving the values that already have five digits intact? I tried formatC, sprintf, and str_pad, and none of them worked, because I am not adding 0s to all values.

  • 1
    In the future, *"I tried ... and none of them worked"* does not help much: we are generally much better at helping with not-working code when we see the code, some input data, and your expected output. Please see https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info for some good ways to make questions *reproducible*, which will speed up the answers you get and make them more relevant. – r2evans Jul 12 '21 at 19:40
  • That was a good suggestion! Will keep that in mind when posting questions next time. – SilverSpringbb Jul 12 '21 at 19:50
  • I voted, but don't know how to accept an answer. – SilverSpringbb Jul 14 '21 at 19:42

4 Answers

7

We can use sprintf

sprintf('%05d', as.integer(zipcodes))
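
As a minimal sketch of how this behaves on mixed input (the sample vector below is made up for illustration, not taken from the question): values that already have five digits pass through unchanged, and shorter ones are zero-padded on the left.

```r
# Hypothetical sample data: some ZIP codes lost their leading zeros, some did not.
zipcodes <- c("1234", "501", "90210", "02134")

# as.integer() drops any existing leading zeros; "%05d" then pads every
# value back to a width of 5 with zeros on the left.
padded <- sprintf("%05d", as.integer(zipcodes))
print(padded)
```

Note that the result is a character vector; converting it back to a number would drop the zeros again.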
akrun
3

In which way did str_pad not work?

https://www.rdocumentation.org/packages/stringr/versions/1.4.0/topics/str_pad

df <- data.frame(zip = c(1, 22, 333, 4444, 55555))

df$zip <- stringr::str_pad(df$zip, width=5, pad = "0")

[1] "00001" "00022" "00333" "04444" "55555"

M.Viking
2

Update:

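Following r2evans's helpful comment: my solution is not very efficient, and to get a leading 0 the paste0 call has to be changed slightly. Here it is with a data frame example. Note that it prepends only a single 0, so it is correct only when the short values have exactly four digits, as in the data below: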

sapply(df$zip, function(x){if(nchar(x)<5){paste0(0,x)}else{x}})

data:

library(tibble)  # tribble() comes from the tibble package
df <- tribble(
    ~zip,
    7889,
    2345,
    45567,
    4394,
    34566,
    4392,
    4599)
df

Output:

[1] "07889" "02345" "45567" "04394" "34566" "04392" "04599"
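
For comparison, here is a vectorized sketch that avoids `sapply` and pads to the full width instead of adding a single zero. It uses base R's `formatC`, which is a swapped-in alternative rather than the approach of this answer, and it assumes the column is numeric. The extra value 501 is added here only to show a three-digit case.

```r
# Same sample values as above, plus a hypothetical three-digit value (501),
# built as a base data.frame so no extra packages are needed.
df <- data.frame(zip = c(7889, 2345, 45567, 4394, 34566, 4392, 4599, 501))

# format = "d" formats the values as integers; flag = "0" pads with zeros
# on the left up to the requested width.
padded <- formatC(df$zip, width = 5, format = "d", flag = "0")
print(padded)
```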

First answer: this adds a trailing zero to each value with fewer than 5 digits, where zip is a vector:

sapply(zip, function(x){if(nchar(x)<5){paste0(x,0)}else{x}})
TarJae
  • 1
    If it's a vector, both `nchar` and `paste0` are vectorized, is there a reason you're explicitly de-vectorizing this process? Also, I don't think this will work correctly, for two reasons: it pads the `0` on the *right*, and it only pads 1 regardless of the length of the string. – r2evans Jul 12 '21 at 19:35
  • 1
    Thanks r2evans. I mixed leading and trailing. I will update my answer. And also thank you for your advice according to efficiency! – TarJae Jul 12 '21 at 20:01
2

If they start as strings and you don't want to (or cannot) convert to integers first, then an alternative to sprintf is

vec <- c('1','11','11111')
paste0(strrep('0', pmax(0, 5 - nchar(vec))), vec)
# [1] "00001" "00011" "11111"

This will handle strings of any length, and is a no-op for strings of 5 or more characters.

In a frame, that would be

dat$colname <- paste0(strrep('0', pmax(0, 5 - nchar(dat$colname))), dat$colname)
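
If the same padding is needed in several places, the expression can be wrapped in a small function. This is only a sketch: `pad_zip` and its `width` argument are hypothetical names introduced here, and the sample frame is made up.

```r
# pad_zip is a hypothetical helper name, not part of any package.
pad_zip <- function(x, width = 5) {
  x <- as.character(x)
  # pmax(0, ...) keeps the repeat count non-negative, so strings that are
  # already `width` characters or longer are returned unchanged.
  paste0(strrep("0", pmax(0, width - nchar(x))), x)
}

# Made-up sample frame, including a value longer than 5 characters.
dat <- data.frame(colname = c("1", "11", "11111", "123456"),
                  stringsAsFactors = FALSE)
dat$colname <- pad_zip(dat$colname)
print(dat$colname)
```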
r2evans