This is a subtle problem with your script, which the following fixes:
import re

txt = "japan isn't 56 country in Europe."
# (?:...) makes the optional decimal part non-capturing
nt = re.findall(r"n't [0-9]+(?:\.[0-9][0-9]?)?", txt)
print(nt)  # prints ["n't 56"]
In your original call to re.findall, you were using this pattern:
n't [0-9]+(\.[0-9][0-9]?)?
This means that the first capture group is the optional decimal part, e.g. .12 (a dot followed by one or two digits). With the re.findall API, if the pattern contains a single capture group, then only the text matched by that group is returned for each match, not the whole match. Your input did match, but the optional group did not take part in the match, so the resulting list held just an empty string ('') rather than the text you wanted. In my corrected version, I made the group non-capturing by using ?:. If the pattern has no capture groups, then the entire match is returned, which is the behavior you want here.
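To see the difference side by side, here is a small sketch. The second sentence in the sample text is my own addition (not from your script), included only so that one match has a decimal part. The last line shows re.finditer as an alternative that gives you the whole match even when the pattern keeps its capture group.

```python
import re

# Sample text; the second sentence is an assumption added for illustration.
txt = "japan isn't 56 country in Europe. It isn't 3.14 either."

# Capturing group: findall returns only the group's text for each match.
# Where the optional group does not take part, the entry is an empty string.
capturing = re.findall(r"n't [0-9]+(\.[0-9][0-9]?)?", txt)
print(capturing)      # prints ['', '.14']

# Non-capturing group: findall returns each full match.
non_capturing = re.findall(r"n't [0-9]+(?:\.[0-9][0-9]?)?", txt)
print(non_capturing)  # prints ["n't 56", "n't 3.14"]

# Alternative: keep the capture group, but ask each match object for group(0).
full = [m.group(0) for m in re.finditer(r"n't [0-9]+(\.[0-9][0-9]?)?", txt)]
print(full)           # prints ["n't 56", "n't 3.14"]
```

If you only ever need the full match, the non-capturing form is probably the simplest choice; finditer is handy when you want both the whole match and the group.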