0

I have the following regular expression:

>>> re.findall(r'\r\n\d+\r\n',contents)[-1]
'\r\n1621\r\n'
>>> re.findall(r'\r\n\d+\r\n',contents)[-1].replace('\r','').replace('\n','')
'1621'

How would I improve the regular expression so that I don't need to use the Python replace methods?

Note that the digits must be surrounded by those characters, so I can't use a plain \d+.

David542

3 Answers

2

Simply use parentheses to create a capturing group:

re.findall(r'\r\n(\d+)\r\n',contents)[-1]

That way the whole pattern still has to match, but because it contains a single capturing group, findall returns only the text inside the parentheses.
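As a minimal sketch of the capturing-group behaviour described above: the sample `contents` string is an assumption made up for illustration, since the real input is not shown in the question.

```python
import re

# Hypothetical sample input; the real `contents` is not shown in the question.
contents = 'header\r\n12\r\nbody\r\n1621\r\n'

# With exactly one capturing group, findall returns the group's text
# instead of the full match, so the surrounding \r\n is dropped.
numbers = re.findall(r'\r\n(\d+)\r\n', contents)
last_number = numbers[-1]

print(numbers)
print(last_number)
```

Note that the surrounding \r\n is still required for a match; it is simply left out of the returned value.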

user
0

user 5061's answer is great.
Alternatively, you can keep the original pattern and use .strip() to remove the leading and trailing "\r\n" characters from the match.

re.findall(r'\r\n\d+\r\n',contents)[-1].strip()
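A short sketch of what .strip() does here, again using a made-up `contents` value as an assumption. With no argument, strip() removes any leading and trailing whitespace; passing '\r\n' restricts it to those two characters.

```python
import re

# Hypothetical sample input; the real `contents` is not shown in the question.
contents = 'header\r\n12\r\nbody\r\n1621\r\n'

# The pattern itself is unchanged, so the match still includes the \r\n pairs.
raw = re.findall(r'\r\n\d+\r\n', contents)[-1]

# strip() with no argument trims all leading/trailing whitespace.
stripped = raw.strip()

# strip('\r\n') trims only carriage returns and line feeds.
stripped_crlf = raw.strip('\r\n')

print(repr(raw), stripped, stripped_crlf)
```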
Community
0

You could use lookbehind and lookahead assertions, which check for the surrounding characters without including them in the match:

re.findall(r'(?<=\r\n)\d+(?=\r\n)',contents)[-1]
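One practical difference is worth sketching. Lookaround assertions do not consume the \r\n, so two numbers that share a single \r\n separator can both be found, whereas a pattern that consumes the separator skips the second one. The sample `contents` below is an assumption chosen to show this case.

```python
import re

# Hypothetical input where two numbers share one \r\n separator.
contents = 'a\r\n10\r\n20\r\nb'

# The consuming pattern uses up the \r\n after "10", so "20" has no
# leading \r\n left to match against and is skipped.
consuming = re.findall(r'\r\n(\d+)\r\n', contents)

# Lookbehind/lookahead only assert that \r\n is present, so the same
# \r\n can sit after "10" and before "20".
lookaround = re.findall(r'(?<=\r\n)\d+(?=\r\n)', contents)

print(consuming)
print(lookaround)
```

If the input can contain consecutive numeric lines and the last one is wanted, the lookaround form is the safer choice for the [-1] lookup.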
tbodt