I am working with Grammatical Evolution (GE) on Python 3.7. My grammar generates executable strings in the format:

np.where(<variable> <comparison_sign> <constant>, (<probability1>), (<probability2>))

However, the string can get quite complex, with several chained np.where calls.

<constant> in some cases contains leading zeros, which causes the executable string to raise errors. GE is expected to generate expressions containing leading zeros; however, I have to detect and remove them. An example of a possible solution containing leading zeros:

"np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))"

Problem:

  • There are two types of numbers containing leading zeros: int and float.
  • Suppose I detect "02" in the string. If I replace every occurrence of "02" in the string with "2", the float "01.5025" will also be changed to "01.525", which must not happen.
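
The pitfall in the second bullet can be reproduced with a plain str.replace call. This is only an illustration of the problem, not an attempted fix:

```python
expression = "np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))"

# Naive approach: replace every occurrence of the substring "02".
# This also hits the "02" inside "01.5025" and corrupts the float.
naive = expression.replace("02", "2")

print(naive)
# np.where(x < 2, np.where(x > 01.525, (0.9), (0.5)), (1))
```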

I've made several attempts with different re patterns, but couldn't solve it. To detect that an executable string contains leading zeros, I use:

try:
  _ = eval(expression)
except SyntaxError:
  new_expression = fix_expressions(expression)
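
A variation on this check, sketched here as an assumption rather than as the code actually used: the SyntaxError is raised when the string is compiled, so the built-in compile can flag a malformed expression without evaluating it, which means x and np do not need to be defined. The name has_syntax_error is hypothetical.

```python
def has_syntax_error(expression):
    """Return True if the expression string fails to compile (hypothetical helper)."""
    try:
        # mode="eval" compiles a single expression without running it
        compile(expression, "<expression>", "eval")
    except SyntaxError:
        return True
    return False

print(has_syntax_error("np.where(x < 02, (0.9), (0.5))"))  # True
print(has_syntax_error("np.where(x < 2, (0.9), (0.5))"))   # False
```

Note that Python 3 rejects 02 as an integer literal but accepts 01.5025 as a float literal, so only the integer form triggers this error.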

I need help building the fix_expressions Python function.

  • is this a solution for you: https://stackoverflow.com/questions/13142347/how-to-remove-leading-and-trailing-zeros-in-a-string-python ? – Alexander Riedel Nov 14 '20 at 18:05
  • Partially. Two things missing: detect solely numbers with leading zeros and replace only those occurrences, without changing other numbers that contain them partially. Example: replacing "02" with "2" without changing "0.025" to "0.25". – Pedro Pereira Nov 14 '20 at 18:33

2 Answers

You could try to come up with a regular expression for numbers with leading zeros and then replace the leading zeros.

import re

def remove_leading_zeros(string):
    return re.sub(r'([^\.^\d])0+(\d)', r'\1\2', string)

print(remove_leading_zeros("np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))"))

# output: np.where(x < 2, np.where(x > 1.5025, (0.9), (0.5)), (1))

The remove_leading_zeros function finds all occurrences of [^\.^\d]0+\d and removes the zeros. [^\.^\d]0+\d reads as: a character that is neither a dot nor a digit, followed by at least one zero, followed by a digit. (Inside the character class, the second ^ is a literal caret, so a caret in front of the zeros is excluded as well.) The parentheses in the regex denote capture groups, which preserve the character before the leading zeros and the digit after them.
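
To make that behaviour concrete, the following sketch runs the same pattern over a few inputs. The sample strings are illustrative assumptions and are not taken from the question's grammar:

```python
import re

def remove_leading_zeros(string):
    # Same pattern as above: keep the preceding character (group 1) and
    # the first digit after the zeros (group 2), drop the zeros between.
    return re.sub(r'([^\.^\d])0+(\d)', r'\1\2', string)

samples = [
    "x < 007",   # several leading zeros  -> "x < 7"
    "(0.025)",   # zero before the dot    -> unchanged
    "x < 100",   # zeros inside a number  -> unchanged
    "x < 0",     # a lone zero            -> unchanged
    "02 + 3",    # zero at string start   -> unchanged (no preceding character)
    "x02 > 1",   # digits in a name       -> "x2 > 1" (probably unwanted)
]

for sample in samples:
    print(repr(sample), "->", repr(remove_leading_zeros(sample)))
```

The last two cases are the limits of this pattern: a number at the very start of the string is not matched, and because any character other than a dot, caret, or digit qualifies as the preceding character, a variable name such as x02 would be rewritten.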


Regarding Csaba Toth's comment:

The problem with 02+03*04 is that the first leading zero sits at the very beginning of the string, so there is no preceding character for the first capture group to match. The regex can be modified so that the first capture group also matches the beginning of the string:

r"(^|[^\.^\d])0+(\d)"
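
As a minimal sketch of that variant (the names LEADING_ZEROS and remove_leading_zeros_anywhere are hypothetical and not part of the original answer), the modified pattern can be checked against the 02+03*04 example and the expression from the question:

```python
import re

# Pattern from above; the alternation lets group 1 match the start of the string.
LEADING_ZEROS = re.compile(r"(^|[^\.^\d])0+(\d)")

def remove_leading_zeros_anywhere(string):
    # Group 1 is either empty (start of string) or the character before
    # the zeros; group 2 is the first digit that follows them.
    return LEADING_ZEROS.sub(r"\1\2", string)

print(remove_leading_zeros_anywhere("02+03*04"))
# 2+3*4
print(remove_leading_zeros_anywhere(
    "np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))"))
# np.where(x < 2, np.where(x > 1.5025, (0.9), (0.5)), (1))
```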
upe

You can remove leading 0's in a string using .lstrip()

str_num = "02.02025"

print("Initial string: %s \n" % str_num)

str_num = str_num.lstrip("0")

print("Removing leading 0's with lstrip(): %s" % str_num)
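
As a hedged illustration of where this approach applies, the sketch below contrasts lstrip on an isolated numeric token with lstrip on a full expression. The variable names are assumptions made for the example:

```python
expression = "np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))"

# lstrip only removes characters from the left end of the whole string,
# so an expression that starts with "n" comes back unchanged.
unchanged = expression.lstrip("0")

# Applied to an isolated numeric token, it does strip the leading zeros.
token = "02.02025".lstrip("0")

# Caveat: it also strips the single zero in front of a decimal point.
fraction = "0.5".lstrip("0")

print(unchanged == expression)  # True
print(token)                    # 2.02025
print(fraction)                 # .5
```

So lstrip on its own would need each numeric token to be isolated first, and a token such as 0.5 would need separate handling.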
BWallDev
  • This is applicable to cases where I have a string containing only numbers. Notice that my string expressions are more complex. Even if I can detect only numbers containing leading zeros, how can I replace them in the whole expression, without affecting other numbers? Example: replacing "02" with "2" without changing "0.025" to "0.25". – Pedro Pereira Nov 14 '20 at 18:37
  • If your goal is to remove leading 0's, lstrip() doesn't mind whether the string is all numbers or a mix of numbers and characters. If str_num were equal to 02.025, it would return 2.025. – BWallDev Nov 14 '20 at 18:45
  • I tried: "np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))".lstrip("0") and I got: "np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))". So either I can't explain my problem, or I don't understand your suggestion... – Pedro Pereira Nov 14 '20 at 18:56
  • The string that represents `<constant>` is what you would want to use the lstrip("0") function on. However, maybe I am misunderstanding what you're asking. – BWallDev Nov 14 '20 at 19:04
  • My problem is to detect only those occurrences in the whole string and replace only those ones. Example: I detect "02", apply `lstrip()` and replace it in the entire string with `str.replace("02", "2")`. By doing this, the number "0.025" will also be replaced with "0.25", since "02" is a substring of "0.025". Right? – Pedro Pereira Nov 14 '20 at 19:12
  • Copy the code I put above in my answer into a new file and run it and see the output results. I'm not sure what your code looks like so I am not sure how everything is structured. Maybe this won't help your circumstances – BWallDev Nov 14 '20 at 19:46
  • `re.sub(r'([^\.^\d])0+(\d)', r'\1\2', string)` solved my problem. Thank you anyway! – Pedro Pereira Nov 14 '20 at 20:02