1

I'm wondering if there's any way to find out how many pairs of parentheses are in a string.

I have to do some string manipulation and I sometimes have something like:

some_string = '1.8.0*99(0000000*kWh)'

or something like

some_string = '1.6.1*01(007.717*kW)(1604041815)'

What I'd like to do is:

  • get all the digits between the parentheses (e.g. for the first string: 0000000)
  • if there are 2 pairs of parentheses (there will always be at most 2 pairs), get all the digits and join them (e.g. for the second string I'll have: 0077171604041815)

How can I check how many pairs of parentheses are in a string, so that I can later do something like:

if number_of_pairs == 1:
    do_this
else:
    do_that

Or maybe there's an easier way to do what I want, but I couldn't think of one so far.

I know how to get only the digits in a string: final_string = re.sub('[^0-9]', '', my_string), but I'm wondering how I could handle both cases.

Casimir et Hippolyte

3 Answers

4

As parentheses always come in pairs, just count the left (or right) parentheses in the string and you'll get your answer.

num_of_parenthesis = string.count('(')
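
As a minimal sketch of how that count could drive the branching from the question: the helper name `count_pairs` is hypothetical, and it assumes the parentheses are well-formed, so counting one side is enough.

```python
def count_pairs(text):
    # Assumption: every '(' has a matching ')', so counting one side suffices.
    return text.count('(')


for sample in ('1.8.0*99(0000000*kWh)', '1.6.1*01(007.717*kW)(1604041815)'):
    number_of_pairs = count_pairs(sample)
    if number_of_pairs == 1:
        print(sample, '-> one pair')
    else:
        print(sample, '-> two pairs')
```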
Hassan Mehmood
  • If there is a risk of unmatched parentheses, you can also count how many `)` exist in the string to handle those cases too... – arewm Jun 15 '16 at 12:17
  • @arewm there won't be any risk, because in a correct expression parentheses always come in pairs. – Hassan Mehmood Jun 15 '16 at 14:47
  • I agree that there is not much risk if the expression is correct. I offered the suggestion because this check could be done before we know that the expression is correct, giving you the opportunity for graceful error handling. It might not be relevant here, but it's a reminder for others who might look at this in the future. – arewm Jun 15 '16 at 18:46
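
The check discussed in these comments could be sketched as follows. The simplest variant would just compare `text.count('(')` with `text.count(')')`; the version below uses a depth counter instead, so it also catches a `)` that appears before its `(`. The function name `count_balanced_pairs` and the choice to raise `ValueError` are assumptions made for illustration.

```python
def count_balanced_pairs(text):
    """Return the number of matched '(' ... ')' pairs.

    Raises ValueError if any parenthesis is unmatched.
    """
    depth = 0   # how many '(' are currently open
    pairs = 0   # how many pairs have been closed so far
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            if depth == 0:
                raise ValueError("unmatched ')' in %r" % text)
            depth -= 1
            pairs += 1
    if depth != 0:
        raise ValueError("unmatched '(' in %r" % text)
    return pairs
```

Calling it inside a `try`/`except ValueError` block gives a place to handle malformed input before any digits are extracted.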
0

If you want all the digits in a single string, remove any `.` characters first, then use re.findall and join the matches into a single string:

In [15]: s="'1.6.1*01(007.717*kW)(1604041815)'"

In [16]: "".join(re.findall(r"\((\d+).*?\)", s.replace(".", "")))
Out[16]: '0077171604041815'

In [17]: s = '1.8.0*99(0000000*kWh)'
In [18]: "".join(re.findall(r"\((\d+).*?\)", s.replace(".", "")))
Out[18]: '0000000'

The number of parens is irrelevant when all you want is to extract the digits inside them. Since you say there will be at most two pairs, I presume the format is consistent.

Or, if the parens always contain digits, find the data inside each pair of parens and strip everything except the digits:

In [20]: "".join([re.sub(r"[^0-9]", "", m) for m in re.findall(r"\((.*?)\)", s)])
Out[20]: '0077171604041815'
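
For reuse outside an interactive session, the second approach could be wrapped in a small function. This is only a sketch: the name `digits_in_parens` is hypothetical, and it assumes the parentheses are not nested.

```python
import re


def digits_in_parens(text):
    # Capture the contents of every (...) group, non-greedily.
    groups = re.findall(r"\((.*?)\)", text)
    # Drop everything that is not a digit from each group, then join.
    return "".join(re.sub(r"[^0-9]", "", group) for group in groups)


print(digits_in_parens('1.8.0*99(0000000*kWh)'))             # 0000000
print(digits_in_parens('1.6.1*01(007.717*kW)(1604041815)'))  # 0077171604041815
```

The same call handles one pair or two, so no branching on the number of pairs is needed.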
Padraic Cunningham
0

You can do it like this (assuming you already know there's at least one parenthesis):

re.sub(r'[^0-9]+', '', some_string.split('(', 1)[1])

or only with re.sub:

re.sub(r'^[^(]*\(|[^0-9]+', '', some_string)
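
If the "at least one parenthesis" assumption cannot be guaranteed, note that `some_string.split('(', 1)[1]` raises `IndexError` when there is no `(`, while the pure `re.sub` version would return every digit in the string. A hedged sketch of a guarded variant follows; the function name `digits_after_first_paren` and the decision to return an empty string are assumptions, not part of the original answer.

```python
import re


def digits_after_first_paren(text):
    # str.partition never raises: sep is '' when no '(' is present.
    _, sep, tail = text.partition('(')
    if not sep:
        return ''  # assumption: no parenthesis means nothing to extract
    # Keep only the digits that follow the first '('.
    return re.sub(r'[^0-9]+', '', tail)


print(digits_after_first_paren('1.8.0*99(0000000*kWh)'))             # 0000000
print(digits_after_first_paren('1.6.1*01(007.717*kW)(1604041815)'))  # 0077171604041815
```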
Casimir et Hippolyte