
I'm trying to find a smart, quick way to extract some data from a string.

Basically, I want to get all the text inside the parentheses '(...)'.

Example:

ex_string= "My Cell Phone number is (21) 99715-5555"
return = 21

ex_string2 = "Apple (AAPL) have a great quarterly, but Microsoft (MSFT) have a better one"
return = ['AAPL', 'MSFT'] 

ex_string3 = "Hello World"
return = None

The trick is that some strings will have just one item, others will have more than one, and others will have none.

I know I can just .split('(') and then start picking out the items, but I'm looking for a better solution for this case, because I will be parsing tons of strings.

python_user
Felipe Cid

2 Answers


You can use regular expressions.

Here is how I would write it:

import re

def find_enclosed(s):
    # find all matches
    matches = re.findall(r"\((.*?)\)", s)
    # if there are no matches return None
    if not matches:
        return None
    # convert any match that is a valid integer to an int
    for i, match in enumerate(matches):
        try:
            matches[i] = int(match)
        except ValueError:
            pass
    # if there is only one match return it without a list
    if len(matches) == 1:
        return matches[0]

    return matches

And this is how you would use it:

ex_string = "My Cell Phone number is (21) 99715-5555"
ex_string2 = "Apple (AAPL) have a great quarterly, but Microsoft (MSFT) have a better one"

matches1 = find_enclosed(ex_string)
matches2 = find_enclosed(ex_string2)

print(matches1)  # 21
print(matches2)  # ['AAPL', 'MSFT']
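
Since the question mentions parsing a large number of strings, one possible variation is to compile the pattern once and always return a list, so the caller does not have to distinguish between None, a single value, and a list. This is only a sketch under those assumptions; the names PAREN_RE and find_all_enclosed are hypothetical and not part of the code above:

```python
import re

# Hypothetical module-level constant: the same pattern as above,
# compiled once and reused for every string.
PAREN_RE = re.compile(r"\((.*?)\)")


def find_all_enclosed(s):
    """Return a list of the texts found inside parentheses (possibly empty)."""
    return PAREN_RE.findall(s)


strings = [
    "My Cell Phone number is (21) 99715-5555",
    "Apple (AAPL) have a great quarterly, but Microsoft (MSFT) have a better one",
    "Hello World",
]

for s in strings:
    print(find_all_enclosed(s))
# ['21']
# ['AAPL', 'MSFT']
# []
```

An empty list is falsy, so `if find_all_enclosed(s):` still works as a "no match" check, and the values stay strings unless the caller converts them.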
Marko Borković

You should use a regular expression to get the data surrounded by parentheses. Python's built-in re module can be used for this. The regex101.com site is a very useful regex tester where you can build and try out the right pattern yourself.

Code:

import re

# Finds everything inside parentheses (numbers, strings, special characters, etc.).
my_regexp = r"\((.*?)\)"

test_string_1 = "My Cell Phone number is (21) 99715-5555"
test_string_2 = "Apple (AAPL) have a great quarterly, but Microsoft (MSFT) have a better one"
test_string_3 = "Not parenthesis"

print(re.findall(my_regexp, test_string_1))
print(re.findall(my_regexp, test_string_2))
print(re.findall(my_regexp, test_string_3))

Output:

$ python3 test.py
['21']
['AAPL', 'MSFT']
[]
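
To spell out how the pattern reads: `\(` and `\)` match literal parentheses, and `(.*?)` captures as few characters as possible between them. Two edge cases may be worth knowing about: nested parentheses, and text that spans a line break (by default `.` does not match a newline). The sketch below compares the pattern above with an alternative that uses a negated character class; the names LAZY and NO_PARENS are hypothetical, and which behavior is preferable depends on your data:

```python
import re

# The pattern from the answer: lazy match up to the first closing parenthesis.
LAZY = re.compile(r"\((.*?)\)")

# Alternative (an assumption, not from the answer): the captured text may
# contain anything except parentheses, including newlines.
NO_PARENS = re.compile(r"\(([^()]*)\)")

nested = "f(a(b)c)"
print(LAZY.findall(nested))       # ['a(b']
print(NO_PARENS.findall(nested))  # ['b']

multiline = "(a\nb)"
print(LAZY.findall(multiline))       # []
print(NO_PARENS.findall(multiline))  # ['a\nb']
```

For simple inputs like the ones in the question, both patterns return the same results.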
milanbalazs