Just a specific Python question here. Let's say that I have a string like:

s = 'John 000117'

I want the output to be:

s_output = 'John 117'

How would I go about doing this? In general, I need to assume that the input string is a combination of alphabetical characters and numerics, and that the numeric parts of the strings may or may not have 0's padded in front of them. I also want to preserve all parts of the input string that are not numeric, as I am only touching the numeric parts. Adding a lot of spaces is ok because I can just use another regular expression to clean any excess spaces up.

My Initial Attempt

Initially, I mistakenly did the regular expression substitution:

re.sub('^0*', ' ', s)

because I thought this would replace zero or more occurrences of 0 at the beginning of my string with a space. I then realized that this doesn't locate the substring that has leading zeros: the ^ anchors the match to the start of the whole string, so it simply adds a space to the beginning of my string s. My question, more specifically, is: how do I locate substrings that are numerics padded with 0's and then apply a regular expression like the one above?

Any help is appreciated, especially if it's something that can use a regular expression (I would prefer not to split or substring but I will if that's the only way).
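
To make the problem concrete, here is a minimal sketch of what the anchored pattern does, next to the word-boundary pattern suggested in the comments. The helper name strip_leading_zeros is made up for illustration, and the sketch assumes the numeric parts are separated from letters by spaces or punctuation.

```python
import re

s = 'John 000117'

# '^' anchors the match to the start of the whole string. There are no zeros
# before 'J', so '0*' matches the empty string there and the replacement
# (a single space) is simply inserted at position 0.
anchored = re.sub(r'^0*', ' ', s)
print(repr(anchored))  # ' John 000117'


def strip_leading_zeros(text):
    """Hypothetical helper: drop zeros that start a run of digits."""
    # \b  : word boundary, so the zeros must begin a "word"
    # 0+  : one or more zeros to discard
    # (\d): the digit that follows, kept via \1, so a run of only
    #       zeros still leaves a single '0' behind
    return re.sub(r'\b0+(\d)', r'\1', text)


print(strip_leading_zeros(s))           # John 117
print(strip_leading_zeros('John 000'))  # John 0
```

Note that digits glued directly to letters, as in 'x007', are left alone by this pattern, because there is no word boundary between a letter and a digit.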

AndrewJaeyoung
  • A word boundary and `+` quantifier, `re.sub(r'\b0+', '', s)`, see https://regex101.com/r/h23TRE/1. Or, `re.sub(r'\b0+(\d)', r'\1', s)` – Wiktor Stribiżew Apr 10 '22 at 16:14
  • try `re.sub(r'(0+)([^0]+)', r'\2', s)` – Epsi95 Apr 10 '22 at 16:18
  • int("000117") ? – Joran Beasley Apr 10 '22 at 16:19
  • @WiktorStribiżew My question was downvoted because there was already another question similar to mine (https://stackoverflow.com/questions/64836950/remove-leading-zeros-from-python-complex-executable-string), but in my opinion, yours is much more elegant and readable for my specific task. I didn't even know about the word boundary \b. Thank you for teaching me a new thing. – AndrewJaeyoung Apr 10 '22 at 16:29
  • @Epsi95 Appreciate the response. This was definitely more along the lines of the majority of failed attempts (referencing a 0 in my string somewhere), but I had no clue how to reference the capture group. Thank you for teaching me something new. – AndrewJaeyoung Apr 10 '22 at 16:36
  • Yeah, but it is still not quite clear/final: the word boundary might not work, say, in case you have `Text 0060.005`. It will also remove zeros after the decimal separator, and you would need `(?<!\d\.)` together with the word boundary. There are a lot of similarly titled questions; I checked several and chose the closest one to close the question. You might want to make the question more precise, and it might get reopened then. – Wiktor Stribiżew Apr 10 '22 at 16:49
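
The decimal caveat from the last comment can be checked with a short sketch. It assumes `.` is the decimal separator and combines the `(?<!\d\.)` lookbehind with the word-boundary pattern from the earlier comment; the variable names are only illustrative.

```python
import re

text = 'Text 0060.005'

# Word boundary alone: '.' is a non-word character, so there is also a
# boundary right after the decimal point and the fractional zeros are lost.
word_boundary_only = re.sub(r'\b0+(\d)', r'\1', text)
print(word_boundary_only)  # Text 60.5

# The negative lookbehind rejects a match that is immediately preceded by
# "digit + dot", which leaves the fractional part untouched.
guarded = re.sub(r'(?<!\d\.)\b0+(\d)', r'\1', text)
print(guarded)  # Text 60.005
```

Whether this guard is needed depends on whether the input can contain decimal numbers at all, which the question does not state.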
