I'm using this expression to find the words that contain at least 3 digits: \b(?<=\s).*?[0-9]{1}.*?[0-9]{1}.*?[0-9]{1}.*?\b

I tested it on Pythex and it works well, but in my Python code the result is always None. Can someone help?

infile:

IZN8TEIS
IZN89EIS
F7G74VCT
K8Z5PXJ8
O3HNWT3X
QY8479AG
R12PJ6XH
IZN8TEIS
JCON42W5

import re

with open(infile) as fin, open(outfile, "w+") as fout:
    for line in fin:
        match = re.search(r"\b(?<=\s).*?[0-9]{1}.*?[0-9]{1}.*?[0-9]{1}.*?\b", line)
        # If I print match here, it is always None
        if match:
            fout.write(line)
        else:
            print(line)

Also tested with

pattern = re.compile("\b(?<=\s).*?[0-9]{1}.*?[0-9]{1}.*?[0-9]{1}.*?\b")
pattern.search(line)

same result.

Arseny
  • As a general tip, you can group repeating subpatterns to make the expression more readable; for example `\b(?:[A-Z]*\d){3}\w*\b`. – oriberu Dec 12 '22 at 02:52

1 Answer

Your regex only matches at a position that is immediately preceded by a whitespace character (the (?<=\s) lookbehind in the regex).

It works when you test multiple lines as a single string, because every line after the first is preceded by a newline. But in your Python code, you go through the lines in the file one at a time:

    for line in fin:
        ...

None of those lines has preceding whitespace, so the regex can't find a match.
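
To make the difference concrete, here is a minimal sketch that runs the pattern from the question against one line on its own and against the same line preceded by another line. The sample strings are taken from the question's data; the variable names are only illustrative.

```python
import re

# Pattern from the question. The raw string keeps \b as a word boundary
# (in a normal string literal, "\b" would be a backspace character).
PATTERN = re.compile(r"\b(?<=\s).*?[0-9]{1}.*?[0-9]{1}.*?[0-9]{1}.*?\b")

# One line as "for line in fin" yields it: nothing precedes the word.
single_line = "F7G74VCT\n"
per_line_match = PATTERN.search(single_line)

# The same word with a previous line in front of it, as in an online tester.
joined = "IZN8TEIS\nF7G74VCT\n"
joined_match = PATTERN.search(joined)

print(per_line_match)        # None
print(joined_match.group())  # F7G74VCT
```

In the joined string, the newline that ends the first line is the whitespace the lookbehind needs.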

You could join the lines in the file with a "\n", or read the entire contents at once (for example with fin.read()), and then run the search on that single string.
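
A rough sketch of that approach, assuming the file content has already been read into one string (with a real file this would be something like fin.read()). The helper name words_with_three_digits is hypothetical, and prepending a newline is my own addition so that the first word in the file is also preceded by whitespace.

```python
import re

# Pattern from the question, as a raw string so \b stays a word boundary.
PATTERN = re.compile(r"\b(?<=\s).*?[0-9]{1}.*?[0-9]{1}.*?[0-9]{1}.*?\b")


def words_with_three_digits(text):
    """Return every match of the question's pattern in the whole text."""
    # Prepend a newline (an assumption of this sketch) so the lookbehind
    # can also succeed for the very first word.
    return PATTERN.findall("\n" + text)


# Stand-in for the file contents; a few lines from the question's data.
sample = "IZN8TEIS\nIZN89EIS\nF7G74VCT\nK8Z5PXJ8\n"
matches = words_with_three_digits(sample)

print(matches)  # ['F7G74VCT', 'K8Z5PXJ8']
```

Each match can then be written to the output file, one per line, instead of writing the original line.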

Arseny