I tested the pattern in Notepad++ and on online Python regular-expression testers, and it works there. In my Python script, however, it does not match: the regex.search method returns None.

Text:

Û  Downloads      : 20314 times                                      
Û  Language       : English                                       
Û  Format         : srt                                           
Û  Total          : 1 subtitle file                               

Pattern:

^.{1,3}\s+(.*?):\s+(.*?)$

Script:

import re

with open('file.txt', 'r', encoding='utf-8') as f:
    string = f.read()
    print(string)
    pattern = r'^.{1,3}\s+(.*?):\s+(.*?)$'
    regex = re.compile(pattern)
    match = regex.search(string, re.UNICODE | re.M)
    print('Matching "%s"' % pattern)
    print('  ', match.group())
    print('  ', match.groupdict())
Cœur
RockOnGom

1 Answer

You need to pass the flags to re.compile(), not to the compiled pattern's search() method:

>>> regex = re.compile(pattern,re.U|re.M)
>>> regex.search(st)
<_sre.SRE_Match object at 0x7f367951b2d8>
>>> regex.search(st).group()
u'\u251c\xa2  Downloads      : 20314 times 
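As a minimal Python 3 sketch of the same fix, the flags can be given once at compile time and the compiled pattern reused for every line. The sample text is inlined here (an assumption, standing in for file.txt), and the `fields` dictionary and the `.strip()` calls are illustrative additions, not part of the original script.

```python
import re

# Sample text modelled on the question's input (inlined instead of reading file.txt).
text = (
    "Û  Downloads      : 20314 times   \n"
    "Û  Language       : English       \n"
    "Û  Format         : srt           \n"
    "Û  Total          : 1 subtitle file\n"
)

# Flags go into re.compile(); re.M lets ^ and $ match at every line boundary.
regex = re.compile(r'^.{1,3}\s+(.*?):\s+(.*?)$', re.U | re.M)

# search() returns the first matching line.
first = regex.search(text)
print(first.group(1).strip(), '=', first.group(2).strip())

# finditer() walks over every matching line; strip() drops the column padding.
fields = {m.group(1).strip(): m.group(2).strip() for m in regex.finditer(text)}
print(fields)
```

Because the lazy groups capture the padding spaces around each column, stripping them afterwards keeps the pattern itself unchanged.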

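If you pass the flags to the compiled pattern's search() instead, it returns None, because the second argument of that method is the start position (pos), not the flags: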

>>> regex = re.compile(pattern)
>>> regex.search(st,re.U|re.M).group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
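The reason for the None can be checked directly: a compiled pattern's search(string, pos, endpos) treats its second argument as a start index, and the flag constants are integers. The sketch below assumes Python 3 and uses a single sample line (shorter than 40 characters) so that the effect of pos is isolated from the missing re.M; the variable names are illustrative.

```python
import re

pattern = r'^.{1,3}\s+(.*?):\s+(.*?)$'
line = "Û  Downloads      : 20314 times"  # one line, shorter than 40 characters

# The flag constants are integers: re.U is 32 and re.M is 8.
flags = re.U | re.M
print(int(flags))

no_flags = re.compile(pattern)

# With no extra argument the single line matches from index 0.
print(no_flags.search(line))

# Here `flags` is taken as pos=40, so the search starts past the end of the
# line and ^ can no longer match: the result is None.
print(no_flags.search(line, flags))

# Passing the flags to re.compile() is what makes multi-line input match.
with_flags = re.compile(pattern, flags)
multi = line + "\nÛ  Language       : English"
print(with_flags.search(multi).group(1).strip())
```

Guarding the result with `if match is not None:` before calling group() also avoids the AttributeError when nothing matches.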
Mazdak