I have a large dataset of addresses that include U.S. ZIP codes. Some of the ZIP codes are in five-digit format, and others are in nine-digit format. Regardless of format, if the ZIP code has a leading zero (like many in Rhode Island), that zero has been dropped. So I need to go through the d$zip column, identify observations where the zip has length 4 or length 8, and put paste0("0", d$zip) in its place to add back the leading zero. My question is how to write this conditional check efficiently, given that I have almost 100,000 addresses.
Here is a toy df:
d <- structure(list(ID = 1:3, street = c("555 Mockingbird Way", "909 Deadend Alley",
"1475 Wrongway Rd"), city = c("Anywhere", "Over There", "Nowhere"
), state = c("RI", "RI", "TX"), zip = c("2863", "28632142", "78215"
)), class = "data.frame", row.names = c(NA, -3L))
Note: There are two related questions already, but they do not address the check for the 4- or 8-digit format.
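For concreteness, here is a minimal sketch of the check I have in mind, using only base R. It assumes that zip is stored as character and that the nine-digit codes contain no hyphen, as in the toy data; needs_zero is just an illustrative name.

```r
# Toy data from above, reduced to the relevant columns
d <- data.frame(
  ID  = 1:3,
  zip = c("2863", "28632142", "78215"),
  stringsAsFactors = FALSE
)

# TRUE where the zip appears to have lost its leading zero
# (4 or 8 characters instead of 5 or 9)
needs_zero <- nchar(d$zip) %in% c(4L, 8L)

# Prepend "0" only to the flagged rows; all other rows are left untouched
d$zip[needs_zero] <- paste0("0", d$zip[needs_zero])

d$zip
# [1] "02863"     "028632142" "78215"
```

Both nchar() and paste0() are vectorized, so this avoids an explicit loop, and I would not expect 100,000 rows to be a problem, though I have not timed it on the full data.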