print(nltk.regexp_tokenize('That U.S.A. poster-print costs $12.40...13.10', r"((?:(?:[A-Z]\.)+)|(?:\w+(?:-\w+)*)|(?:\d+(?:\.\d+)?))"))
Outputs:
['That', 'U.S.A.', 'poster-print', 'costs', '12', '40', '13', '10']
And, with the order of the alternatives inside the parentheses changed (the number pattern now comes before the word pattern):
print(nltk.regexp_tokenize('That U.S.A. poster-print costs $12.40...13.10', r"((?:(?:[A-Z]\.)+)|(?:\d+(?:\.\d+)?)|(?:\w+(?:-\w+)*))"))
Outputs:
['That', 'U.S.A.', 'poster-print', 'costs', '12.40', '13.10']
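The effect does not seem to be specific to NLTK. Below is a smaller reproduction that uses only the standard `re` module and just the two alternatives that were swapped. The names `WORD`, `NUMBER`, `word_first` and `number_first` are mine, for illustration only, and I am assuming that `re.findall` is a fair stand-in for `nltk.regexp_tokenize` here.

```python
import re

text = "costs $12.40...13.10"

# Illustrative names (not part of NLTK): the two alternatives that were swapped.
WORD = r"\w+(?:-\w+)*"      # words, optionally hyphenated; note that \w also matches digits
NUMBER = r"\d+(?:\.\d+)?"   # integers or decimals such as 12.40

# Same two alternatives, only the order differs.
word_first = re.findall(WORD + "|" + NUMBER, text)
number_first = re.findall(NUMBER + "|" + WORD, text)

print(word_first)    # ['costs', '12', '40', '13', '10']
print(number_first)  # ['costs', '12.40', '13.10']
```

Since `\w` also matches digits, both alternatives can start a match at the `1` of `12.40`, yet the result depends on which one is listed first.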
Why does the order of the alternatives matter in this case?