I am working on a Python function that uses regular expressions to find an acronym in parentheses within a sentence, along with the phrase it stands for. For example, in "The Department of State (DOS) is the United States federal executive department responsible for international relations of the United States." I want to get the acronym "DOS" and its meaning, "Department of State".
What I have so far is:
import re

text = "The Department of State (DOS) is the United States federal executive department responsible for international relations of the United States."
pattern = re.compile(r"^(.*?)(?:\((.*)\))?$")
result = ''
for i in pattern.finditer(text):
    result += text
print(result)
The output is the entire text sentence. I am new to regex and am probably misunderstanding the structure. From what I understand, r will match the characters, the ^ asserts the position at the start of the string, .*? matches any character, *? matches between zero and unlimited times, the ? will match zero or one times, the \(\) will match the parentheses, and the $ asserts the position at the end of the string. I apologize if I am greatly misunderstanding any of this; I appreciate any help with understanding it.
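As a minimal sketch for checking this reading of the pattern, the snippet below prints what each capture group holds for the example sentence, then tries a narrower pattern. The second pattern and the rule "the meaning is the N words right before the parenthesis, where N is the number of letters in the acronym" are assumptions made for illustration only; they fit this example but would likely need adjusting for other sentences.

```python
import re

text = ("The Department of State (DOS) is the United States federal executive "
        "department responsible for international relations of the United States.")

# The pattern from the attempt above. Because of the trailing $, the optional
# (?:...)? group can only take part in the match if ")" is the last character.
original = re.compile(r"^(.*?)(?:\((.*)\))?$")
m = original.match(text)
whole_line = m.group(1)      # the lazy .*? is forced to expand over the whole sentence
inside_parens = m.group(2)   # None, because the optional group was skipped

# Hypothetical narrower pattern: one or more uppercase letters inside parentheses.
acronym_match = re.search(r"\(([A-Z]+)\)", text)
acronym = acronym_match.group(1)

# Assumption: take as many words before "(" as there are letters in the acronym.
words_before = text[:acronym_match.start()].split()
meaning = " ".join(words_before[-len(acronym):])

print(whole_line == text, inside_parens)   # True None
print(acronym, "->", meaning)              # DOS -> Department of State
```

The first print suggests why the whole sentence comes back: group 1 ends up holding everything and group 2 holds nothing. Note also that the r in front of the pattern is Python's raw-string prefix rather than part of the regex itself.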
Thanks!