I am using a regex in Python to detect numbers from "0" to "999 999 999" (with a space as the thousands separator) inside a string.
import re
test_string = "b\'[<span id=\"prl\">114 893</span>]\'"
# Raw strings keep backslashes such as \d and \s from being treated as string escapes
working_pattern = r"\d{1,3}\s\d{3}"
non_working_pattern = r"\d{1,3}(\s\d{3}){0,2}"
wk_ptrn = re.findall(working_pattern, test_string)
non_wk_ptrn = re.findall(non_working_pattern, test_string)
print(wk_ptrn)
print(non_wk_ptrn)
The results are:
print(wk_ptrn) displays: ['114 893']
print(non_wk_ptrn) displays: [' 893'] (with a space before the first digit)
The non_working_pattern is "\d{1,3}(\s\d{3}){0,2}":
\d{1,3} : matches 1 to 3 digits ["0" to "999"]
(\s\d{3}) : matches one whitespace character followed by 3 digits [" 000" to " 999"]
{0,2} : is a quantifier that repeats the group 0 to 2 times, so I can match from "0" (quantifier = 0) to "999[ 999][ 999]" (quantifier = 2).
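To cross-check the breakdown above, here is a minimal sketch that looks at the match objects directly instead of the list returned by re.findall. It relies on the documented behavior that re.findall returns the contents of the group when the pattern contains exactly one capturing group. The name non_capturing_pattern and the (?:...) variant are my own additions for comparison and are not part of the original code.

```python
import re

test_string = 'b\'[<span id="prl">114 893</span>]\''
pattern = r"\d{1,3}(\s\d{3}){0,2}"

# re.finditer yields match objects, so the whole match (group 0) and the
# captured group (group 1) can be inspected separately.
matches = [(m.group(0), m.group(1)) for m in re.finditer(pattern, test_string)]
print(matches)  # [('114 893', ' 893')]

# Hypothetical variant (an assumption, not in the original code): the same
# pattern with a non-capturing group, so findall returns the whole match.
non_capturing_pattern = r"\d{1,3}(?:\s\d{3}){0,2}"
whole_matches = re.findall(non_capturing_pattern, test_string)
print(whole_matches)  # ['114 893']
```

So the pattern does match the full "114 893"; only the value reported for the capturing group is " 893", which is the last repetition of the group.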
I don't understand why "\d{1,3}(\s\d{3}){0,2}" doesn't work.
Can you please help me figure out the mistake?
Thank you. Regards.